Visualizes datasets in 2D by computing embeddings and reducing them with UMAP or t-SNE dimensionality reduction.
ALWAYS follow these rules:
set_context(dataset_name="my-dataset")
Brain operators are delegated operators and require the App to be running:
launch_app()
Wait 5-10 seconds for initialization.
# List all brain operators
list_operators(builtin_only=False)
# Get schema for specific operator
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
Embeddings are required for dimensionality reduction:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "img_sim",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)
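Before executing, it can help to sanity-check the params dict. The helper below is a hypothetical, illustrative snippet (plain Python; not part of the tool API or of FiftyOne itself):

```python
# Hypothetical sanity check for a compute_similarity params dict.
# REQUIRED_KEYS mirrors the required parameters documented below.
REQUIRED_KEYS = {"brain_key", "model", "embeddings", "backend", "metric"}

def validate_similarity_params(params):
    """Return the sorted list of missing required keys (empty when complete)."""
    return sorted(REQUIRED_KEYS - params.keys())

params = {
    "brain_key": "img_sim",
    "model": "clip-vit-base32-torch",
    "embeddings": "clip_embeddings",
    "backend": "sklearn",
    "metric": "cosine",
}
assert validate_similarity_params(params) == []  # complete
assert validate_similarity_params({"brain_key": "x"}) == [
    "backend", "embeddings", "metric", "model"
]
```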
close_app()
# Set context
set_context(dataset_name="my-dataset")
# Launch app (required for brain operators)
launch_app()
# Check if brain plugin is available
list_plugins(enabled=True)
# If not installed:
download_plugin(
    url_or_repo="voxel51/fiftyone-plugins",
    plugin_names=["@voxel51/brain"]
)
enable_plugin(plugin_name="@voxel51/brain")
# List all available operators
list_operators(builtin_only=False)
# Get schema for compute_visualization
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
First, check if the dataset already has embeddings by looking at the operator schema:
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# Look for existing embeddings fields in the "embeddings" choices
# (e.g., "clip_embeddings", "dinov2_embeddings")
If embeddings exist: Skip to Step 5 and use the existing embeddings field.
If no embeddings exist: Compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "img_viz",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",  # Field name to store embeddings
        "backend": "sklearn",
        "metric": "cosine"
    }
)
Required parameters for compute_similarity:
- `brain_key` - Unique identifier for this brain run
- `model` - Model from the FiftyOne Model Zoo used to generate embeddings
- `embeddings` - Field name where embeddings will be stored
- `backend` - Similarity backend (use `"sklearn"`)
- `metric` - Distance metric (use `"cosine"` or `"euclidean"`)

Recommended embedding models:

- `clip-vit-base32-torch` - Best for general visual + semantic similarity
- `dinov2-vits14-torch` - Best for visual similarity only
- `resnet50-imagenet-torch` - Classic CNN features
- `mobilenet-v2-imagenet-torch` - Fast, lightweight option

Use an existing embeddings field OR the brain_key from Step 4:
# Option A: Use existing embeddings field (e.g., clip_embeddings)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "img_viz",
        "embeddings": "clip_embeddings",  # Use existing field
        "method": "umap",
        "num_dims": 2
    }
)
# Option B: Use brain_key from compute_similarity
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "img_viz",  # Same key used in compute_similarity
        "method": "umap",
        "num_dims": 2
    }
)
Dimensionality reduction methods:
- `umap` - (Recommended) Preserves local and global structure, faster. Requires the `umap-learn` package.
- `tsne` - Better local structure, slower on large datasets. No extra dependencies.
- `pca` - Linear reduction, fastest but less informative.

After computing the visualization, direct the user to open the FiftyOne App at http://localhost:5151/ and:

1. Click the Embeddings panel icon
2. Select the brain key (e.g., `img_viz`) from the dropdown
3. Use "Color by" to color points by a label field (e.g., `ground_truth`, `predictions`)

IMPORTANT: Do NOT use `set_view(exists=["brain_key"])` - this filters samples and is not needed for visualization. The Embeddings panel automatically shows all samples with computed coordinates.
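For intuition about what these methods produce, here is a minimal numpy sketch of the `pca` option on synthetic embeddings (random data assumed for illustration; `umap` and `tsne` consume and produce the same shapes, they just preserve structure differently):

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 512))  # 200 samples of 512-dim embeddings

def pca_2d(X):
    """Project rows of X onto their top-2 principal components."""
    Xc = X - X.mean(axis=0)  # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T     # (n_samples, 2) coordinates

coords = pca_2d(embeddings)
assert coords.shape == (200, 2)
```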
To filter samples while viewing in the Embeddings panel:
# Filter to specific class
set_view(filters={"ground_truth.label": "dog"})
# Filter by tag
set_view(tags=["validated"])
# Clear filter to show all
clear_view()
These filters will update the Embeddings panel to show only matching samples.
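As a mental model for field filtering, here is a hypothetical in-memory sketch (the sample dicts and `apply_filter` helper are illustrative only; real filtering happens inside `set_view()`):

```python
# Hypothetical stand-in samples; real filtering is done server-side by set_view()
samples = [
    {"id": 1, "ground_truth": {"label": "dog"}, "tags": ["validated"]},
    {"id": 2, "ground_truth": {"label": "cat"}, "tags": []},
    {"id": 3, "ground_truth": {"label": "dog"}, "tags": []},
]

def apply_filter(samples, field, value):
    """Keep samples whose nested field (e.g., 'ground_truth.label') equals value."""
    key, subkey = field.split(".")
    return [s for s in samples if s[key][subkey] == value]

dogs = apply_filter(samples, "ground_truth.label", "dog")
assert [s["id"] for s in dogs] == [1, 3]
```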
Outliers appear as isolated points far from clusters:
# Compute uniqueness scores (higher = more unique/outlier)
execute_operator(
    operator_uri="@voxel51/brain/compute_uniqueness",
    params={
        "brain_key": "img_viz"
    }
)
# View most unique samples (potential outliers)
set_view(sort_by="uniqueness", reverse=True, limit=50)
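The idea behind uniqueness scoring can be sketched with plain numpy: score each sample by its mean distance to its k nearest neighbors, so isolated points score highest (synthetic data; the actual `compute_uniqueness` implementation differs):

```python
import numpy as np

rng = np.random.default_rng(1)
# A tight cluster of 50 points plus one planted far-away outlier
points = np.vstack([rng.normal(0.0, 0.1, size=(50, 8)), np.full((1, 8), 5.0)])

def uniqueness_scores(X, k=3):
    """Mean distance to the k nearest neighbors; higher = more isolated."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # ignore self-distances
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

scores = uniqueness_scores(points)
assert scores.argmax() == 50  # the planted outlier scores highest
```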
Use the App's Embeddings panel to visually identify clusters, then:
Option A: Lasso selection in App
Option B: Use similarity to find cluster members
# Sort by similarity to a representative sample
execute_operator(
    operator_uri="@voxel51/brain/sort_by_similarity",
    params={
        "brain_key": "img_viz",
        "query_id": "sample_id_from_cluster",
        "k": 100
    }
)
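With a cosine metric, similarity sorting amounts to ranking embeddings by cosine similarity to the query. A minimal numpy sketch on synthetic data (not the actual `sort_by_similarity` implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
embeddings = rng.normal(size=(100, 16))
query = embeddings[7]  # a representative sample from the cluster

def top_k_by_cosine(X, q, k):
    """Indices of the k rows of X most similar to q under cosine similarity."""
    sims = (X @ q) / (np.linalg.norm(X, axis=1) * np.linalg.norm(q))
    return np.argsort(-sims)[:k]

top = top_k_by_cosine(embeddings, query, k=10)
assert top[0] == 7  # the query itself is its own best match
```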
close_app()
| Tool | Description |
|---|---|
| `set_view(filters={...})` | Filter samples by field values |
| `set_view(tags=[...])` | Filter samples by tags |
| `set_view(sort_by="...", reverse=True)` | Sort samples by field |
| `set_view(limit=N)` | Limit to N samples |
| `clear_view()` | Clear filters, show all samples |
Use `list_operators()` to discover operators and `get_operator_schema()` to see their parameters:
| Operator | Description |
|---|---|
| `@voxel51/brain/compute_similarity` | Compute embeddings and similarity index |
| `@voxel51/brain/compute_visualization` | Reduce embeddings to 2D/3D for visualization |
| `@voxel51/brain/compute_uniqueness` | Score samples by uniqueness (outlier detection) |
| `@voxel51/brain/sort_by_similarity` | Sort by similarity to a query sample |
Visualize dataset structure and explore clusters:
set_context(dataset_name="my-dataset")
launch_app()
# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# If embeddings exist (e.g., clip_embeddings), use them directly:
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "exploration",
        "embeddings": "clip_embeddings",
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
# Direct user to App Embeddings panel at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "exploration" from dropdown
# 3. Use "Color by" to color by ground_truth or predictions
Identify anomalous or mislabeled samples:
set_context(dataset_name="my-dataset")
launch_app()
# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "outliers",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)
# Compute uniqueness scores
execute_operator(
    operator_uri="@voxel51/brain/compute_uniqueness",
    params={"brain_key": "outliers"}
)
# Generate visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "outliers",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "outliers" from dropdown
# 3. Outliers appear as isolated points far from clusters
# 4. Optionally sort by uniqueness field in the App sidebar
See how different classes cluster:
set_context(dataset_name="my-dataset")
launch_app()
# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "class_viz",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)
# Generate visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "class_viz",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "class_viz" from dropdown
# 3. Use "Color by" dropdown to color by ground_truth or predictions
# Look for:
# - Well-separated clusters = good class distinction
# - Overlapping clusters = similar classes or confusion
# - Scattered points = high variance within class
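These visual cues can also be quantified. A rough numpy sketch on synthetic "classes": score separation as the distance between class centroids divided by the mean within-class spread (values above 1 suggest well-separated clusters; this heuristic is illustrative, not a FiftyOne metric):

```python
import numpy as np

rng = np.random.default_rng(3)
class_a = rng.normal(loc=0.0, scale=0.5, size=(60, 32))  # one synthetic class
class_b = rng.normal(loc=4.0, scale=0.5, size=(60, 32))  # well-separated class
class_c = rng.normal(loc=0.0, scale=0.5, size=(60, 32))  # overlaps class_a

def separation_ratio(X, Y):
    """Between-centroid distance over mean within-class spread."""
    spread = (np.linalg.norm(X - X.mean(axis=0), axis=1).mean()
              + np.linalg.norm(Y - Y.mean(axis=0), axis=1).mean()) / 2
    return float(np.linalg.norm(X.mean(axis=0) - Y.mean(axis=0)) / spread)

assert separation_ratio(class_a, class_b) > 1.0  # distinct clusters
assert separation_ratio(class_a, class_c) < 1.0  # overlapping clusters
```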
Compare ground truth vs predictions in embedding space:
set_context(dataset_name="my-dataset")
launch_app()
# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "pred_analysis",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)
# Generate visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "pred_analysis",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "pred_analysis" from dropdown
# 3. Color by ground_truth - see true class distribution
# 4. Color by predictions - see model's view
# 5. Look for mismatches to find errors
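The mismatch hunt in step 5 boils down to comparing the two label fields sample by sample; a tiny sketch with hypothetical label lists:

```python
# Hypothetical label lists standing in for the ground_truth and predictions fields
ground_truth = ["dog", "cat", "dog", "bird", "cat"]
predictions  = ["dog", "cat", "cat", "bird", "dog"]

# Indices where the model disagrees with ground truth; in the Embeddings panel
# these are the points whose color flips when switching "Color by" fields
mismatches = [i for i, (g, p) in enumerate(zip(ground_truth, predictions)) if g != p]
assert mismatches == [2, 4]
```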
Use t-SNE for better local structure (no extra dependencies):
set_context(dataset_name="my-dataset")
launch_app()
# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# If no embeddings exist, compute them (DINOv2 for visual similarity):
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "tsne_viz",
        "model": "dinov2-vits14-torch",
        "embeddings": "dinov2_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)
# Generate t-SNE visualization (no umap-learn dependency needed)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "tsne_viz",
        "embeddings": "dinov2_embeddings",  # Use existing field if available
        "method": "tsne",
        "num_dims": 2
    }
)
# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "tsne_viz" from dropdown
# 3. t-SNE provides better local cluster structure than UMAP
Error: "No executor available"
- Ensure `launch_app()` was called and wait 5-10 seconds

Error: "Brain key not found"
- Run `compute_similarity` first with a `brain_key`

Error: "Operator not found"
- Install and enable the brain plugin with `download_plugin()` and `enable_plugin()`

Error: "You must install the umap-learn>=0.5 package"
- Install the `umap-learn` package: `pip install umap-learn`
- Or change `method` to `"tsne"` (no extra dependencies)
- Or change `method` to `"pca"` (fastest, no extra dependencies)

Visualization is slow
- Use a lighter model such as `mobilenet-v2-imagenet-torch`
- Limit the view with `set_view(limit=1000)`

Embeddings panel not showing / Points not colored correctly
- Use `list_operators()` and `get_operator_schema()` to get current operator names and parameters
- Verify the correct `brain_key` is selected in the Embeddings panel

Embedding computation time:
Visualization computation time:
Memory requirements: