Smart cropping & image analysis

This guide covers the four ways Convertly’s image CDN can decide where and how to crop, and how to feed it the metadata you already have from a CMS, design tool, or art director. It also covers the standalone analyze endpoint, which emits focal points and palettes you can persist alongside every upload.

Side-by-side crop examples for gravity=auto, gravity=smart, gravity=face, and focal points live on Transforms → Resize, crop & fit. This page goes deeper on focal points, entropy crops, and the analyze API.

When to use which mode

Scenario	Recommended mode	Why
Default for unknown assets	`gravity=auto`	Fast saliency, great default.
Portrait / team photo / anything with people	`gravity=face`	Real face detection. Picks the highest-confidence face as the crop centre; falls through to smart crop if no face is found.
Subject is off-centre or close to an edge	`gravity=smart` (or `crop=smart`)	Runs full content-aware scoring; better than libvips for tricky layouts.
You already know where the subject is	`fp=x,y`	Pure pixel math. Zero extra cost on every render.
Marketing hero where art direction matters	`fp=x,y` + a one-time `/api/images/analyze` call to seed it	Storage cost is one row per asset; renders stay deterministic.
Detail / texture sources (no obvious subject)	`gravity=entropy`	Picks the densest region by entropy without looking for skin/saliency cues.
Hard-coded corner	`gravity=center`, `gravity=northwest`, etc.	When the source is already composed for you.

When auto is not enough

On off-centre hero images, gravity=auto (default saliency) can miss the subject. gravity=smart runs a heavier content-aware pass that scores detail, saturation, and skin tone across the frame.

Center gravity crop vs smart gravity crop on an off-centre aerial photo — Off-centre hero — `gravity=center` (left) vs `gravity=smart` (right).

For the full gravity comparison grid (auto, smart, face, center), see Transforms → Resize, crop & fit.

Manual focal points (`fp=x,y`)

fp=x,y accepts a normalised coordinate where (0, 0) is the top-left of the source and (1, 1) is the bottom-right. Both percentages and decimals work:

?w=600&h=400&fp=0.3,0.4    # decimals in [0, 1]
?w=600&h=400&fp=30%,40%    # percentages

Convertly takes a pre-resize extract whose aspect ratio matches the requested w/h, then resizes that extract down. The focal point lands at the centre of the crop window, clamped if the requested aspect would push the window past a source edge. This is the cheapest crop mode — it’s pure pixel math with no analysis pass — and it’s the only mode that’s deterministic across renditions. If you serve the same image at 1200×800, 600×400, and 300×200 with the same focal point, all three crops centre on the same physical pixel.
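The crop-window arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not Convertly's implementation: the function name `focal_crop_window`, the choice of the largest window that fits the source, and the rounding are all assumptions made for the example.

```python
def focal_crop_window(src_w, src_h, out_w, out_h, fx, fy):
    """Return (left, top, width, height) of the largest window with the
    output aspect ratio, centred on the focal point and clamped to the source.

    fx, fy are normalised focal coordinates in [0, 1].
    """
    target = out_w / out_h
    if src_w / src_h > target:
        # Source is wider than the target aspect: keep full height, trim width.
        win_h = src_h
        win_w = round(src_h * target)
    else:
        # Source is taller (or equal): keep full width, trim height.
        win_w = src_w
        win_h = round(src_w / target)
    # Centre the window on the focal point...
    left = round(fx * src_w - win_w / 2)
    top = round(fy * src_h - win_h / 2)
    # ...then clamp so it never extends past a source edge.
    left = max(0, min(left, src_w - win_w))
    top = max(0, min(top, src_h - win_h))
    return left, top, win_w, win_h


# A 4000x2000 source rendered at 3:2 with a focal point near the right edge.
print(focal_crop_window(4000, 2000, 600, 400, 0.9, 0.5))
```

Because the window depends only on the source size, the focal point, and the requested aspect ratio, 1200×800, 600×400, and 300×200 renditions all resolve to the same extract in this sketch.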

Wide hero image with a focal point marker on the subject — Source hero with suggested focal point from `/api/images/analyze` (purple marker).

Center crop vs focal point crop side by side — `gravity=center` (left) clips the subject. `fp=x,y` (right) keeps it centred.

<img
  src="https://cdn.convertly.sh/assets-acme/{fileIdOrSlug}?w=1200&h=800&fp=0.4,0.35"
  srcset="
    https://cdn.convertly.sh/assets-acme/{fileIdOrSlug}?w=600&h=400&fp=0.4,0.35  600w,
    https://cdn.convertly.sh/assets-acme/{fileIdOrSlug}?w=900&h=600&fp=0.4,0.35  900w,
    https://cdn.convertly.sh/assets-acme/{fileIdOrSlug}?w=1200&h=800&fp=0.4,0.35 1200w
  "
  sizes="(min-width: 768px) 50vw, 100vw"
  alt="Product hero"
/>
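If you generate markup like this from code, a small helper keeps the focal point identical across renditions. The sketch below is one possible approach; `build_srcset` and the `hero` slug are hypothetical names, and only the `w`, `h`, and `fp` parameters come from this page.

```python
def build_srcset(base_url, widths, aspect, fp):
    """Build a srcset string in which every rendition shares one focal point.

    aspect is width / height; fp is a (x, y) tuple of normalised coordinates.
    """
    fx, fy = fp
    entries = []
    for w in widths:
        h = round(w / aspect)  # keep the same aspect ratio at every width
        entries.append(f"{base_url}?w={w}&h={h}&fp={fx},{fy} {w}w")
    return ", ".join(entries)


srcset = build_srcset(
    "https://cdn.convertly.sh/assets-acme/hero",  # hypothetical file slug
    widths=[600, 900, 1200],
    aspect=3 / 2,
    fp=(0.4, 0.35),
)
print(srcset)
```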

Face-aware cropping (`gravity=face`)

gravity=face runs face detection on the source and uses the centre of the highest-confidence face as the crop focus. In group shots the crop therefore locks onto whichever face the model is most confident about, which is usually the closest or most-foreground person. If no face is detected (most product shots, landscapes, screenshots), the request transparently falls through to gravity=smart, then to the saliency strategy. So gravity=face is always at least as good as gravity=auto: it just adds a face-aware step on top.

<img src="https://cdn.convertly.sh/assets-acme/{fileIdOrSlug}?w=400&h=400&fit=cover&gravity=face" alt="Team headshot" />

See gravity=face for a side-by-side crop example. Detection cost depends on source size and cache state. Like every other smart-crop mode, this cost is paid once per cache miss: once the output bytes are cached, subsequent requests serve the cached render.
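To make the "highest-confidence face wins" rule concrete, here is a small sketch that turns a list of face boxes into a normalised focal point. It assumes boxes shaped like the `faces[]` entries returned by /api/images/analyze, and it assumes `x`/`y` are the top-left corner of each box; `face_focal` is a hypothetical helper, not part of any Convertly SDK.

```python
def face_focal(faces, src_w, src_h):
    """Return the normalised centre of the highest-scoring face box,
    or None when no face was detected (the caller should fall through)."""
    if not faces:
        return None
    best = max(faces, key=lambda f: f["score"])
    # Assumption: x/y mark the top-left corner of the box, in source pixels.
    cx = best["x"] + best["width"] / 2
    cy = best["y"] + best["height"] / 2
    return cx / src_w, cy / src_h


faces = [
    {"x": 600, "y": 200, "width": 100, "height": 100, "score": 0.5},
    {"x": 100, "y": 50, "width": 200, "height": 100, "score": 0.9},
]
print(face_focal(faces, 1000, 500))
print(face_focal([], 1000, 500))
```

Returning None for an empty list mirrors the fall-through behaviour: the caller can then try a content-aware strategy instead.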

Content-aware smart cropping (`gravity=smart`)

gravity=smart (or its alias crop=smart) runs the smartcrop algorithm on the source. smartcrop scores every candidate window for three signals — pixel-level detail, saturation, and skin-tone presence — then returns the rectangle with the highest aggregate score for the requested aspect ratio. It’s slower than the saliency strategy (gravity=auto) by ~50–100ms on a typical 1080p source, but it picks the right region noticeably more often on:

- Hero shots where the subject is off-centre
- Portraits with significant negative space
- Product photography with the item in a corner
- Editorial imagery where the visual weight is asymmetric

<img src="https://cdn.convertly.sh/assets-acme/{fileIdOrSlug}?w=600&h=600&fit=cover&gravity=smart" alt="…" />

The pass runs once per cache miss. Once the resulting bytes are cached, every subsequent request serves the already-encoded image with no analysis cost, so a popular hero pays the smart-crop cost once no matter how many times it is served. See gravity=smart for a side-by-side comparison with gravity=center.

Entropy cropping (`gravity=entropy`)

gravity=entropy picks the region with the highest pixel entropy — detail density — without saliency or face cues. Use it when there is no obvious subject: product flat-lays, fabric swatches, maps, or texture photography where you want the busiest area in frame.

?w=600&h=600&fit=cover&gravity=entropy

Entropy-based crop on a dense texture flat-lay — Texture source — `gravity=entropy` keeps the most detailed region in frame.
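The idea behind an entropy crop can be shown with a toy example: score every candidate window by the Shannon entropy of its pixel values and keep the highest. This is a deliberately naive sketch on a tiny grayscale grid, not the algorithm the CDN runs; both function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of a sequence of pixel intensities."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_entropy_window(pixels, win_w, win_h):
    """Slide a win_w x win_h window over a 2D grayscale grid and return
    (left, top) of the first window with the highest entropy."""
    rows, cols = len(pixels), len(pixels[0])
    best_score, best_pos = -1.0, (0, 0)
    for top in range(rows - win_h + 1):
        for left in range(cols - win_w + 1):
            window = [
                pixels[y][x]
                for y in range(top, top + win_h)
                for x in range(left, left + win_w)
            ]
            score = shannon_entropy(window)
            if score > best_score:
                best_score, best_pos = score, (left, top)
    return best_pos


# Flat region on the left, detailed region on the right.
grid = [
    [0, 0, 0, 0, 10, 20],
    [0, 0, 0, 0, 30, 40],
    [0, 0, 0, 0, 50, 60],
    [0, 0, 0, 0, 70, 80],
]
print(best_entropy_window(grid, 2, 2))
```

A flat window scores zero bits, so the search gravitates to the busiest area, which is why this mode suits textures and flat-lays rather than portraits.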

Combining focal points with the API

The recommended pattern at scale:

1. On upload, call /api/images/analyze once. Convertly persists the result on the stored file as metadata.image_analysis.
2. On render, either pass fp=x,y explicitly or omit fp; the CDN reads stored analysis automatically for fit=cover crops on Convertly Storage files.
3. Art director override: update the stored value (or pass fp=x,y on the URL). URLs stay deterministic.

This keeps the hot path zero-cost while letting you serve smart-cropped renditions to every device.
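If you choose to pass fp explicitly, the application side of this pattern might look like the sketch below. It assumes you have kept the analyze response JSON per asset; `fp_param` is a hypothetical helper, and the 0.3 cut-off mirrors the confidence guidance in the Response field notes rather than any enforced limit.

```python
def fp_param(analysis, override=None, min_confidence=0.3):
    """Return an `fp=x,y` query fragment, or None to fall back to a centre crop.

    analysis is a saved /api/images/analyze response; override is an optional
    (x, y) tuple chosen by an art director.
    """
    if override is not None:
        x, y = override
    else:
        suggested = analysis["focal"]["suggested"]
        if suggested["confidence"] < min_confidence:
            return None  # weak subject: a centre crop is probably fine
        x, y = suggested["x"], suggested["y"]
    return f"fp={x:.3f},{y:.3f}"


saved = {"focal": {"suggested": {"x": 0.42, "y": 0.38, "confidence": 0.92, "source": "face"}}}
print(fp_param(saved))
print(fp_param(saved, override=(0.5, 0.25)))
```

Because the override is checked first, an art director's choice always wins over the stored suggestion, and the resulting URL stays stable between deploys.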

CDN focal discovery

Param	Use when
`fp=x,y`	You know the focal point. Cheapest — pure pixel math.
(omit `fp`)	File was analyzed via API — CDN uses stored focal for `fit=cover` automatically.
`fp=auto`	Recompute focal inline (face → smart-crop) for the requested aspect. Ignores stored analysis.
`focal=json`	Return focal metadata as JSON instead of an image — useful for CMS previews and debugging.

Append focal=json after your resize params. The CDN resolves focal in priority order: explicit fp=x,y → stored image_analysis → inline face detection → smart-crop → centre.

curl "https://cdn.convertly.sh/{key}/{id}?w=640&h=320&fit=cover&focal=json"

Example response:

{
  "x": 0.42,
  "y": 0.35,
  "confidence": 0.81,
  "source": "metadata",
  "cached": true,
  "analyzedAt": "2026-06-20T12:00:00.000Z",
  "width": 4000,
  "height": 2667
}

source is one of manual, metadata, face, smart-crop, or center. When cached is true, the value came from a prior /api/images/analyze run on the same file.

fp=auto is the lazy mode for files you have not analyzed yet:

?w=600&h=400&fit=cover&fp=auto

The CDN runs face detection, then smart-crop for the requested aspect, and crops around the result. Cost is paid once per cache miss, same as gravity=smart.

focal=json returns focal metadata only — not palette or face boxes. Use POST /api/images/analyze for full analysis at upload time.
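The priority order described in this section (explicit fp, then stored analysis, then face detection, then smart-crop, then centre) can be modelled as a simple fallback chain. The sketch below is illustrative only: `resolve_focal` and its callable arguments are hypothetical, and mapping an explicit fp to the `manual` label and stored analysis to `metadata` is an assumption based on the source values listed above.

```python
def resolve_focal(explicit_fp=None, stored=None, detect_face=None, smart_crop=None):
    """Walk the documented priority order and return (x, y, source).

    detect_face and smart_crop are optional zero-argument callables that
    return an (x, y) tuple or None.
    """
    if explicit_fp is not None:
        return (*explicit_fp, "manual")
    if stored is not None:
        return (stored["x"], stored["y"], "metadata")
    for detector, label in ((detect_face, "face"), (smart_crop, "smart-crop")):
        if detector is not None:
            point = detector()
            if point is not None:
                return (*point, label)
    return (0.5, 0.5, "center")  # safe fallback when nothing else applies


# No stored analysis, face detection finds nothing, smart-crop succeeds.
print(resolve_focal(detect_face=lambda: None, smart_crop=lambda: (0.6, 0.4)))
```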

The analyze endpoint

POST /api/images/analyze returns a suggested focal point (from face detection when a face is found, otherwise from smartcrop), the suggested crop window in pixel coordinates, the detected face boxes, and a full colour palette. Use it once per upload, persist the result, and reuse it forever.

API Reference

OpenAPI spec and interactive playground for POST /api/images/analyze.

Authenticate

Same auth as every other media-tools API: a regular API key (cvly_…) or a logged-in session cookie. CDN signing keys are not accepted on this endpoint — it’s a workspace-write path, not a public CDN one.

Two request shapes

Stored file (recommended)

If the image is already in Convertly Storage, send a JSON body with the file’s id. No upload needed.

curl -X POST "https://convertly.sh/api/images/analyze" \
  -H "Authorization: Bearer $CONVERTLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "fileId": "01h-xxx-...-uuid" }'
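For server-side code, the same stored-file request can be assembled with Python's standard library. This is a sketch under the assumption that the endpoint behaves exactly as the curl example shows; `build_analyze_request` is a hypothetical helper and the key value is a placeholder.

```python
import json
import urllib.request

ANALYZE_URL = "https://convertly.sh/api/images/analyze"


def build_analyze_request(file_id, api_key):
    """Build (but do not send) the stored-file analyze request."""
    body = json.dumps({"fileId": file_id}).encode("utf-8")
    return urllib.request.Request(
        ANALYZE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_analyze_request("01h-xxx-...-uuid", "cvly_placeholder")
print(req.get_method(), req.full_url)
# To send it: json.load(urllib.request.urlopen(req))
```

Keeping request construction separate from sending makes the call easy to unit-test without touching the network.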

Fresh upload

If the image isn’t stored yet (onboarding, one-off analysis, “what colour is this?” calls), use multipart:

curl -X POST "https://convertly.sh/api/images/analyze" \
  -H "Authorization: Bearer $CONVERTLY_API_KEY" \
  -F "file=@./hero.jpg"

Multipart uploads are capped at 25 MB.

What analyze returns visually

Analyze runs face detection and palette extraction on the source once. Persist the JSON next to the file — every CDN render with fp=x,y stays deterministic and free of per-request analysis.

Face analysis boxes beside a face-aware crop result — Before: face analysis boxes. After: face-aware crop centered on the detected group.

Hero image with Original palette and Dominance palette swatch rows — Original and dominance palette swatches — two labelled rows below the image.

Colour palette extraction

Convertly supports two palette workflows:

Workflow	When to use
`POST /api/images/analyze`	Once per upload — persist focal + palette JSON in your CMS for theming
CDN `?palette=json|css`	On demand after transforms; the palette comes from the rendered image

CDN palette

Append palette=json or palette=css to any CDN URL after your resize, crop, and adjustment params. The CDN runs the full image pipeline, rasterises to PNG internally, extracts colours, and returns JSON or CSS instead of an image.

<!-- CSS utility classes for theming -->
<link
  rel="stylesheet"
  href="https://cdn.convertly.sh/{key}/{id}?w=640&h=320&fit=crop&palette=css&prefix=hero"
/>

<!-- JSON for programmatic theming -->
<link
  rel="preload"
  as="fetch"
  href="https://cdn.convertly.sh/{key}/{id}?w=640&h=320&fit=crop&palette=json"
  crossorigin
/>

Landscape crop with Original palette and Dominance palette swatch rows below the image — CDN `?w=640&h=320&fit=crop&palette=json` — Original palette (named swatches) and Dominance palette (population-sorted).

Param	Value	Default
`palette`	`json` or `css`	—
`colors`	Max swatches in the population-sorted list (`0` = all named).	`6`
`prefix`	CSS class prefix when `palette=css` (e.g. `hero-bg-1`).	`image`

palette=json returns average_luminance, colors, dominant_colors, and dominant. palette=css emits .{prefix}-fg-N / .{prefix}-bg-N rules plus white/black exception classes. Each preview shows two swatch rows:

- Original palette: named swatches (vibrant, muted, light/dark variants).
- Dominance palette: the same colours sorted by pixel population (strongest first).

Palette extraction runs after transforms, so a warmed or cropped render produces different swatches than the stored original. Combine with sat, hue, crop, bgRemove, etc.

Warm-filtered landscape with palette swatch rows below — After `?sat=35&hue=18&con=92` — dominance swatches shift with the render.

The palette parameter cannot be combined with format=svg&vectorize=gradient output; request the palette on a raster format (webp, png) instead.

Analyze API (persist at upload)

Call POST /api/images/analyze once per upload, persist the JSON, and theme your UI from the swatches without a per-page CDN palette request. Each swatch includes:

Field	Use
`hex` / `rgb`	CSS variables, buttons, borders
`population`	How dominant the colour is in the image
`bodyTextOnColor`	`"#ffffff"` or `"#000000"` — readable body text on that swatch

The doc preview shows two rows below the image:

- Original palette: vibrant, muted, and light/dark variants (Material Design / node-vibrant categories).
- Dominance palette: the same swatches sorted by pixel weight so the strongest colours read left-to-right.
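One way to theme a UI from a persisted palette is to emit CSS custom properties, ordered by population so the strongest colour comes first. The sketch below assumes a palette object shaped like the analyze response; the function name and the `--vibrant` / `--vibrant-text` variable naming are hypothetical choices for this example.

```python
def palette_to_css_vars(palette, selector=":root"):
    """Emit CSS custom properties from analyze swatches, strongest colour first.

    The `dominant` entry is skipped because it repeats one of the named swatches.
    """
    swatches = sorted(
        ((name, s) for name, s in palette.items() if name != "dominant"),
        key=lambda item: item[1]["population"],
        reverse=True,
    )
    lines = [f"{selector} {{"]
    for name, swatch in swatches:
        lines.append(f"  --{name}: {swatch['hex']};")
        lines.append(f"  --{name}-text: {swatch['bodyTextOnColor']};")
    lines.append("}")
    return "\n".join(lines)


palette = {
    "lightVibrant": {"hex": "#f1c597", "population": 620, "bodyTextOnColor": "#000000"},
    "vibrant": {"hex": "#a23f1c", "population": 1240, "bodyTextOnColor": "#ffffff"},
    "dominant": {"hex": "#a23f1c", "population": 1240, "bodyTextOnColor": "#ffffff"},
}
print(palette_to_css_vars(palette))
```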

Palettes follow your transforms

For CDN palette, pass the same transform params on the URL — the swatches reflect the rendered output:

curl "https://cdn.convertly.sh/{key}/{id}?w=800&sat=35&hue=18&con=92&palette=json"

For analyze, the endpoint runs on the bytes you send — usually the stored original. If you want palette data that matches a styled render without CDN palette, fetch that render first, then POST those bytes:

Landscape with warm filter and Original and Dominance palette swatches below — Analyze on a warmed CDN render — Original and Dominance palette rows below the image.

# 1. Render with transforms and save the bytes (format=png is assumed here; use any raster output)
curl -o ./render.png "https://cdn.convertly.sh/{key}/{id}?w=800&sat=35&hue=18&con=92&format=png"

# 2. POST the rendered bytes as a multipart upload
curl -X POST "https://convertly.sh/api/images/analyze" \
  -H "Authorization: Bearer $CONVERTLY_API_KEY" \
  -F "file=@./render.png"

# Alternative: analyse the stored original by fileId and apply its swatches to the styled card in your UI.

SVG and analyse

Stored SVG files are accepted (image/svg+xml passes the image/* check). Internally the file is rasterised before quantisation, so palette extraction works on illustrated SVGs and icons.

Input	Palette quality	Notes
Photos / JPEG / WebP	Best	Full vibrant/muted spectrum
Flat icons / single-colour SVG	Sparse	May return one or two swatches; face detection usually empty
Complex SVG illustrations	Good	Colours come from the rasterised preview

Smart crop and face detection on pure logo SVGs are usually low-confidence — expect focal.suggested.source: "center" for flat artwork.

Response

{
  "width": 1920,
  "height": 1080,
  "focal": {
    "suggested": { "x": 0.42, "y": 0.38, "confidence": 0.92, "source": "face" },
    "crop": { "left": 320, "top": 0, "width": 1080, "height": 1080 }
  },
  "faces": [
    { "x": 740, "y": 280, "width": 220, "height": 220, "score": 0.94 },
    { "x": 1180, "y": 320, "width": 180, "height": 180, "score": 0.71 }
  ],
  "palette": {
    "vibrant":      { "hex": "#a23f1c", "rgb": [162, 63, 28], "population": 1240, "bodyTextOnColor": "#ffffff" },
    "muted":        { "hex": "#7a6a5e", "rgb": [122, 106, 94], "population": 980, "bodyTextOnColor": "#ffffff" },
    "lightVibrant": { "hex": "#f1c597", "rgb": [241, 197, 151], "population": 620, "bodyTextOnColor": "#000000" },
    "darkVibrant":  { "hex": "#3a1607", "rgb": [58, 22, 7], "population": 510, "bodyTextOnColor": "#ffffff" },
    "lightMuted":   { "hex": "#d4c4b1", "rgb": [212, 196, 177], "population": 740, "bodyTextOnColor": "#000000" },
    "darkMuted":    { "hex": "#403832", "rgb": [64, 56, 50], "population": 410, "bodyTextOnColor": "#ffffff" },
    "dominant":     { "hex": "#a23f1c", "rgb": [162, 63, 28], "population": 1240, "bodyTextOnColor": "#ffffff" }
  }
}

Field notes:

- focal.suggested.{x, y} is normalised to [0, 1]. Paste straight into ?fp=x,y.
- focal.suggested.confidence is in [0, 1]. Values under ~0.3 mean the algorithm couldn’t find a strong subject; treat them as “centre crop is probably fine.”
- focal.suggested.source tells you which backend produced the focal point:
  - face: a face was detected; the focal point is the centre of the highest-confidence face box. Highly trustworthy.
  - smart-crop: no face found; smartcrop’s content-aware pick was used.
  - center: both passes failed (very small image, single-colour, or unsupported format); the centre is the safe fallback.
- focal.crop is the suggested square crop window in source pixel coordinates. Useful if you want to physically store a thumbnail rather than render one on demand.
- faces is the full list of detected face boxes, sorted by score descending. Each box is in source-image pixel coordinates with a confidence score. Empty array = no faces detected (a successful run); the field is present even when gravity=face would fall through to smart crop.
- palette.dominant is whichever named swatch has the highest pixel population. Renders nicely as the card-background colour in product UIs.
- palette.*.bodyTextOnColor is a WCAG-style “white or black for body text?” hint.
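For readers who want to reproduce a hint like bodyTextOnColor for their own colours, one common approach is to compare WCAG 2.x contrast ratios against white and black. The sketch below uses the published WCAG relative-luminance formula; whether Convertly computes its hint exactly this way is not stated here, so treat it as an approximation. The function names are hypothetical.

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB colour given as [r, g, b] in 0-255."""
    def linear(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(lum_a, lum_b):
    """WCAG contrast ratio between two relative luminances."""
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def body_text_on(rgb):
    """Pick white or black body text, whichever contrasts more with rgb."""
    lum = relative_luminance(rgb)
    white = contrast_ratio(1.0, lum)
    black = contrast_ratio(lum, 0.0)
    return "#ffffff" if white >= black else "#000000"


print(body_text_on([162, 63, 28]))    # the vibrant swatch from the response above
print(body_text_on([241, 197, 151]))  # the lightVibrant swatch
```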

Quota

One /api/images/analyze call = 1 media operation against your plan’s monthly quota. All plans include the endpoint. See Limits for media API request allowances.

Platform fit

Convertly targets teams that want one platform for storage, conversion, compression, CDN delivery, and optional Forma AI. The tables below summarize what the Image CDN supports today.

Cropping, focal points, and analysis

Capability	Convertly
Manual focal point in URL	`fp=x,y` (normalised or percentages)
Inline focal recompute	`fp=auto` (face → smart-crop for requested aspect)
Stored focal auto-apply	Omit `fp` on analyzed Convertly Storage files with `fit=cover`
Focal metadata on CDN URL	`?focal=json`
Content-aware crop	`gravity=smart` or `crop=smart`
Saliency / auto crop	`gravity=auto` (default)
Face-aware crop	`gravity=face` (falls through to smart)
Entropy / detail crop	`gravity=entropy`
Persist focal + palette at upload	`POST /api/images/analyze` (included quota)
Palette on CDN URL	`?palette=json`, `?palette=css&prefix=hero`
Face boxes in API response	`/api/images/analyze` → `faces[]`

Format, delivery, and platform

Capability	Convertly
Format negotiation (WebP / AVIF)	`format=auto`
Progressive JPEG	`jpgProgressive` (default on)
Custom CDN hostname	Custom domain
Named transform presets	`?preset=` or `/p/{name}`
Signed premium transforms	HMAC `s=` + optional `exp=`
Origin fetch (no re-upload)	Origin sources
Video poster / clip trim	CDN `t`, `so`, `du` — video transforms
Bundled media conversion API	Convert, compress, workflows

AI, privacy, and SVG

Capability	Convertly
Background removal	`bgRemove=1` — all plans
Super-resolution	`upscale=2|4` free, `upscale=ai` signed
Generative fill / replace	`fill=gen`, `bgReplace` (Forma, signed)
Face and plate blur / pixelate	`blurFaces`, `pixelateFaces`, `blurPlates`, `pixelatePlates`, regions
SVG sanitize + rasterize	`svgSanitize` (default on)
SVG path recolour	`svgColor=RRGGBB`
Raster → SVG trace	`format=svg&vectorize=gradient`

Metering (high level)

Area	Convertly
CDN delivery	Monthly image CDN origin requests by plan
Analysis / palette	1 media op per `/api/images/analyze` call
AI transforms	Forma AI unit quota; `bgRemove` included on all plans

Convertly keeps transforms URL-first so storage, CDN delivery, and optimization can live in the same workflow.

Where Convertly fits best

Strong fit: Product teams that want resize, crop, format auto, watermarks, palette extraction, and ML background removal from flat transform URLs — plus origin-backed delivery without re-uploading every asset.
Evaluate carefully: Deep proprietary DAM workflows or apps that depend on a single-vendor upload widget and folder UI — Convertly offers folders and dashboard tools with a different model; plan a hybrid origin period if needed.

Use fp=x,y when you already know the subject position, or call /api/images/analyze when you want Convertly to suggest focal points and face boxes automatically.
