MongoDB Schema Query Review

Review MongoDB schemas and queries for document modeling mistakes, index gaps, and aggregation anti-patterns.

Cursor

Tags: mongodb, mongoose, nosql, document-modeling, aggregation, indexing, database, javascript, typescript

How to Use

Save as .cursor/rules/mongodb-schema-query-review.mdc with globs targeting your schema and query files, e.g., models/**, schemas/**, **/aggregations/**, and any files matching *.mongo.js or *.pipeline.ts. Activate by opening a Mongoose model, a MongoDB aggregation pipeline file, or a migration script, then running Cursor agent mode. Invoke manually in chat with @mongodb-schema-query-review when reviewing collection design or query performance. Verify installation under Cursor Settings > Rules and confirm the rule appears with correct glob patterns. Test by opening a Mongoose schema with an unbounded array field and confirming the agent flags it as Critical.

Agent Definition

Review MongoDB collection schemas, Mongoose or mongosh definitions, and aggregation pipelines for document modeling mistakes, missing or misaligned indexes, and query patterns that degrade at scale.

Document Modeling

Favor embedding when the child data is read together with the parent and the array will not grow unbounded. Reference with a separate collection when the related data is queried independently, updated frequently, or can exceed 100 elements per document.

Flag any array field that lacks an explicit size constraint or TTL-based cleanup. An unbounded array is the single most common cause of document bloat and degraded write performance. When you find one, recommend either bucketing into a separate collection with a parent reference, or capping with $slice on writes.
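As a minimal sketch of the capping option, the update below appends one element and trims the array in the same write. The helper name `buildCappedPush` and the field name `recentEvents` are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: builds an update document that appends one event and
// keeps only the newest `maxItems` entries in the (assumed) `recentEvents` array.
function buildCappedPush(event, maxItems) {
  return {
    $push: {
      recentEvents: {
        $each: [event],    // $slice is only valid together with $each
        $slice: -maxItems, // a negative value keeps the last N elements
      },
    },
  };
}

const cappedUpdate = buildCappedPush({ type: "login", at: new Date(0) }, 50);
// Intended use (not executed here): collection.updateOne({ _id: userId }, cappedUpdate)
```

Because the trim happens inside the same update, the array cannot grow past the cap between writes.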

Polymorphic documents sharing a collection must include a discriminator field indexed and used in every query filter. Without it, queries scan the full collection regardless of document type.

Avoid nesting subdocuments more than two levels deep. Each nesting level complicates atomic updates and lengthens the dotted paths that indexes and filters must reference. Flatten to a single embedded level or extract to a referenced collection.

Index Strategy

Every query pattern must have a supporting index. Inspect find, aggregate $match, and $sort stages. If a query filters on fields A and B and sorts on C, the compound index must follow the Equality-Sort-Range rule: {A: 1, B: 1, C: 1} when A and B are equality filters and C is the sort key.
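The ordering rule can be sketched as a small helper that assembles an index key from the three field groups. The function name `buildEsrIndex` and the field names are hypothetical; only the Equality, Sort, Range ordering comes from the rule above.

```javascript
// Hypothetical helper: orders index keys as Equality, then Sort, then Range.
// `equality` and `range` are arrays of field names; `sort` maps field -> 1 | -1.
function buildEsrIndex({ equality = [], sort = {}, range = [] }) {
  const keys = new Map(); // Map preserves insertion order
  for (const field of equality) keys.set(field, 1);
  for (const [field, direction] of Object.entries(sort)) {
    if (!keys.has(field)) keys.set(field, direction);
  }
  for (const field of range) {
    if (!keys.has(field)) keys.set(field, 1);
  }
  return Object.fromEntries(keys);
}

const esrSpec = buildEsrIndex({
  equality: ["tenantId", "status"],
  sort: { createdAt: -1 },
  range: ["total"],
});
// Object.keys(esrSpec) → ["tenantId", "status", "createdAt", "total"]
```

The resulting object could be passed to `createIndex`; the sketch assumes field names are not integer-like strings, since JavaScript reorders those keys.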

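Flag indexes that duplicate a prefix of another compound index. MongoDB can use a compound index to satisfy queries on any prefix of its fields, so a standalone index on {A: 1} is redundant when {A: 1, B: 1} exists, unless the standalone index carries options the compound index lacks (for example unique, sparse, partial, or TTL).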
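A prefix check of this kind might look like the following sketch. It compares key patterns only and deliberately ignores index options, so treat its output as candidates for review rather than a verdict; the function names are hypothetical.

```javascript
// Returns true when `candidate` is a strict leading prefix of `other`.
// Conservative assumption: field names AND directions must match.
function isPrefixOf(candidate, other) {
  const a = Object.entries(candidate);
  const b = Object.entries(other);
  if (a.length >= b.length) return false; // identical patterns are not handled here
  return a.every(([field, dir], i) => b[i][0] === field && b[i][1] === dir);
}

// Hypothetical helper: lists key patterns that are prefixes of another pattern.
function findRedundantIndexes(indexes) {
  return indexes.filter((idx, i) =>
    indexes.some((other, j) => i !== j && isPrefixOf(idx, other))
  );
}

const redundant = findRedundantIndexes([{ a: 1 }, { a: 1, b: 1 }, { b: 1 }]);
// redundant → [{ a: 1 }]
```

Note that `{ b: 1 }` is not flagged: it is a suffix, not a prefix, of `{ a: 1, b: 1 }`.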

Partial indexes should be used for queries that always include a specific filter condition (e.g., {status: "active"}). This reduces index size and write overhead.
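For illustration, a partial index definition and a simplified eligibility check could look like this. The field names are hypothetical, and the check handles only top-level equality conditions with scalar values, which is much narrower than what the query planner actually evaluates.

```javascript
// Illustrative index definition; `customerId`, `createdAt`, and `status`
// are assumed field names.
const partialIndex = {
  keys: { customerId: 1, createdAt: -1 },
  options: { partialFilterExpression: { status: "active" } },
};
// Intended use (not executed here):
//   collection.createIndex(partialIndex.keys, partialIndex.options)

// Simplified check: does the query filter repeat every equality condition
// from the partial filter expression?
function filterMatchesPartialIndex(filter, partialFilterExpression) {
  return Object.entries(partialFilterExpression).every(
    ([field, value]) => filter[field] === value
  );
}

const eligible = filterMatchesPartialIndex(
  { customerId: 7, status: "active" },
  partialIndex.options.partialFilterExpression
);
// eligible → true
```

A query that omits `status: "active"` cannot rely on this index, which is why the condition should be one the application always sends.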

TTL indexes must target a field that holds a Date (or an array of Date values). Flag any TTL index on a non-Date field, since it will silently never expire documents.
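One way to sketch this check is to compare index definitions against a map of declared field types. The `schemaTypes` map and the `{ keys, options }` index shape are assumptions made for this example; arrays of dates are not modeled.

```javascript
// Hypothetical helper: flags TTL indexes that cannot expire documents.
// `schemaTypes` is an assumed map of field name -> declared type name.
function findInvalidTtlIndexes(indexes, schemaTypes) {
  return indexes.filter(({ keys, options = {} }) => {
    if (typeof options.expireAfterSeconds !== "number") return false; // not a TTL index
    const fields = Object.keys(keys);
    if (fields.length !== 1) return true; // TTL applies to single-field indexes only
    return schemaTypes[fields[0]] !== "Date";
  });
}

const ttlProblems = findInvalidTtlIndexes(
  [
    { keys: { createdAt: 1 }, options: { expireAfterSeconds: 3600 } },
    { keys: { expiresAt: 1 }, options: { expireAfterSeconds: 0 } },
    { keys: { email: 1 }, options: { unique: true } },
  ],
  { createdAt: "String", expiresAt: "Date", email: "String" }
);
// ttlProblems contains only the `createdAt` index, whose field is declared as String
```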

Aggregation Pipelines

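Place $match stages as early as possible, and use $project to trim unneeded fields before expensive stages. A $match at the start of the pipeline can use indexes; a $match after $unwind or $group generally cannot, because it operates on transformed documents.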
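A rough version of this check can walk the pipeline and report every $match that appears after a reshaping stage. The function name is hypothetical, and a flagged stage is only a candidate: a $match on a field computed by $group is legitimate and cannot be moved.

```javascript
// Hypothetical helper: returns the positions of $match stages that come after
// an $unwind or $group stage.
function findLateMatches(pipeline) {
  const flagged = [];
  let seenReshape = false;
  pipeline.forEach((stage, i) => {
    const op = Object.keys(stage)[0];
    if (op === "$unwind" || op === "$group") {
      seenReshape = true;
    } else if (op === "$match" && seenReshape) {
      flagged.push(i);
    }
  });
  return flagged;
}

const lateMatches = findLateMatches([
  { $unwind: "$items" },
  { $match: { status: "paid" } }, // filters a top-level field; could run first
]);
// lateMatches → [1]
```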

Flag $lookup stages that lack a supporting index on the foreign collection's foreignField. Without one, each input document triggers a collection scan of the foreign collection.
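The sketch below shows one possible form of this check, assuming the index that matters is the one on the joined collection's foreignField. The `indexesByCollection` map is a hypothetical input, and only the localField/foreignField form of $lookup is handled.

```javascript
// Hypothetical helper: returns $lookup specs whose foreignField is not the
// leading key of any known index on the `from` collection.
// `indexesByCollection` is an assumed map of collection name -> key patterns.
function findUnindexedLookups(pipeline, indexesByCollection) {
  return pipeline
    .filter((stage) => stage.$lookup && stage.$lookup.foreignField)
    .map((stage) => stage.$lookup)
    .filter(({ from, foreignField }) => {
      if (foreignField === "_id") return false; // _id always has an index
      const indexes = indexesByCollection[from] || [];
      return !indexes.some((keys) => Object.keys(keys)[0] === foreignField);
    });
}

const lookupPipeline = [
  {
    $lookup: {
      from: "customers",
      localField: "customerRef",
      foreignField: "externalId",
      as: "customer",
    },
  },
];

const unindexed = findUnindexedLookups(lookupPipeline, {
  customers: [{ email: 1 }],
});
// unindexed has one entry: no index on customers starts with externalId
```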

Avoid $unwind on large arrays followed by $group to re-aggregate. This pattern explodes the pipeline's memory usage. Prefer $reduce, $filter, or $map within a $project stage when the goal is array transformation rather than cross-document grouping.
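To make the contrast concrete, here are two illustrative pipelines that both collect the expensive line items of each order. The collection shape (`items` with a `price` field) is an assumption for the example.

```javascript
// Pattern to flag: explode the array, filter, then rebuild it per document.
const unwindThenGroup = [
  { $unwind: "$items" },
  { $match: { "items.price": { $gte: 100 } } },
  { $group: { _id: "$_id", expensiveItems: { $push: "$items" } } },
];

// Preferred when the work stays inside one document: filter the array in place.
const filterInPlace = [
  {
    $project: {
      expensiveItems: {
        $filter: {
          input: "$items",
          as: "item",
          cond: { $gte: ["$$item.price", 100] },
        },
      },
    },
  },
];
```

The two are not strictly equivalent: the first drops orders with no matching items, while the second keeps them with an empty array, so a trailing $match may be needed if that difference matters.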

$group with $push that accumulates unbounded results risks exceeding the 100 MB per-stage memory limit, and each grouped document is still capped at the 16 MB document size limit. Treat allowDiskUse as a last resort; prefer $limit or $bucket to constrain output.

Write Patterns

Flag update operations that use the positional $ operator on nested arrays. The positional $ operator supports only one level of array matching. For nested array updates, use arrayFilters with the filtered positional operator $[<identifier>].
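A nested-array update with arrayFilters might be shaped as follows. The document layout (`items` containing `options`) and the helper `missingArrayFilters` are hypothetical; the helper only understands simple `identifier.field` filter keys.

```javascript
// Illustrative update: mark one option of one line item as selected.
const nestedFilter = { _id: "order-1" };
const nestedUpdate = {
  $set: { "items.$[item].options.$[opt].selected": true },
};
const nestedOptions = {
  arrayFilters: [{ "item.sku": "ABC" }, { "opt.name": "gift-wrap" }],
};
// Intended use (not executed here):
//   collection.updateOne(nestedFilter, nestedUpdate, nestedOptions)

// Hypothetical helper: lists identifiers used in update paths that have no
// matching entry in arrayFilters.
function missingArrayFilters(update, arrayFilters) {
  const used = new Set();
  for (const fields of Object.values(update)) {
    for (const path of Object.keys(fields)) {
      for (const m of path.matchAll(/\$\[([a-z][A-Za-z0-9]*)\]/g)) used.add(m[1]);
    }
  }
  const defined = new Set(
    arrayFilters.flatMap((f) => Object.keys(f).map((k) => k.split(".")[0]))
  );
  return [...used].filter((id) => !defined.has(id));
}

const missing = missingArrayFilters(nestedUpdate, nestedOptions.arrayFilters);
// missing → []
```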

Bulk writes should use ordered: false when the order of individual operations does not matter. Unordered execution lets MongoDB run operations in parallel and continue past individual failures instead of stopping at the first error.
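As a small illustration, an unordered bulk write is the same operation list with one extra option. The documents and field names here are made up for the example.

```javascript
// Independent operations: none depends on the outcome of another.
const bulkOperations = [
  { insertOne: { document: { sku: "A", qty: 1 } } },
  { updateOne: { filter: { sku: "B" }, update: { $inc: { qty: 5 } } } },
  { deleteOne: { filter: { sku: "C" } } },
];

// ordered defaults to true; setting it to false opts in to unordered execution.
const bulkOptions = { ordered: false };
// Intended use (not executed here): collection.bulkWrite(bulkOperations, bulkOptions)
```

Keep the default ordered behavior when a later operation relies on an earlier one, such as an update that targets a document inserted in the same batch.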

Replace legacy findAndModify calls with findOneAndUpdate, passing returnDocument: "after", for atomic read-modify-write. Flag any pattern that reads a document, modifies it in application code, and then saves it back, since this creates a race condition under concurrent writes.
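The atomic alternative can be sketched as a single call that carries both the guard condition and the change. The function `reserveStock` and the `sku`/`stock` fields are hypothetical; `collection` is assumed to be a MongoDB Node.js driver collection, and the exact shape of the resolved value depends on the driver version.

```javascript
// Hypothetical helper: decrement stock only if enough is available.
// The guard lives in the filter, so the check and the write happen atomically
// on the server instead of in application code.
async function reserveStock(collection, sku, qty) {
  return collection.findOneAndUpdate(
    { sku, stock: { $gte: qty } },  // match only when stock is sufficient
    { $inc: { stock: -qty } },      // server-side decrement
    { returnDocument: "after" }     // return the post-update document
  );
}
```

When no document matches, for instance because stock is too low, nothing is modified and the caller can treat the empty result as a failed reservation.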

Severity Levels

Critical -- unbounded arrays without size management, missing indexes on high-frequency query patterns, unindexed $lookup in aggregation, race-prone read-modify-write without atomic operations.

Warning -- redundant indexes duplicating compound prefixes, $unwind followed by $group on large arrays, deeply nested subdocuments beyond two levels, polymorphic collections without discriminator index.

Suggestion -- partial index opportunities on filtered queries, $project stage ordering improvements, bulk write ordered flag optimization, TTL index candidates for temporal data.