MongoDB BulkWrite Java API

MongoDB introduced bulk write methods in version 3.2. In RDBMS terms, this resembles SQL batch execution, where statements are grouped into chunks and each chunk is submitted to the DB for update/insert in a single call.

Here are some important points about the MongoDB bulk write operation:


  1. Useful when you have a large amount of data to update/insert.
  2. The operations are automatically split into batches (1,000 operations per batch by default in the versions this post targets; MongoDB 3.6 raised the limit to 100,000) and executed in an ordered or unordered manner.
  3. This drastically reduces DB round-trip time. Say there are 50 thousand records to update: with batches of 1,000, the app server makes about 50 round trips to the DB instead of 50k.
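The round-trip arithmetic above can be sketched in plain Java. This is only an illustration of the numbers: the driver and server handle the actual batching, and the names `BatchMath`, `roundTrips` and `partition` are hypothetical helpers, not part of the MongoDB API.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchMath {

    /** Number of round trips needed to send totalOps operations in batches of batchSize. */
    static int roundTrips(int totalOps, int batchSize) {
        if (totalOps < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("totalOps must be >= 0 and batchSize > 0");
        }
        return (totalOps + batchSize - 1) / batchSize; // ceiling division
    }

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be > 0");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(50_000, 1_000)); // prints 50
        System.out.println(roundTrips(50_001, 1_000)); // prints 51
    }
}
```

A partial last batch still costs one full round trip, which is why the division rounds up.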

Let's see an example below:


List<WriteModel<Document>> updateDocuments = new ArrayList<WriteModel<Document>>();
for (Long entityId : entityIDs) {

    //Finder doc
    Document filterDocument = new Document();
    filterDocument.append("_id", entityId);

    //Update doc
    Document updateDocument = new Document();
    Document setDocument = new Document();
    setDocument.append("name", "xyz");
    setDocument.append("role", "abc");

    updateDocument.append("$set", setDocument);

    //Update option
    UpdateOptions updateOptions = new UpdateOptions();
    updateOptions.upsert(true); //if true, will create a new doc in case of unmatched find
    updateOptions.bypassDocumentValidation(true); //set true/false

    //Prepare list of Updates
    updateDocuments.add(
            new UpdateOneModel<Document>(
                    filterDocument,
                    updateDocument,
                    updateOptions));

}

//Bulk write options
BulkWriteOptions bulkWriteOptions = new BulkWriteOptions();
bulkWriteOptions.ordered(false); //false: operations may run in any order, and one failure does not stop the rest
bulkWriteOptions.bypassDocumentValidation(true);

MongoCollection<Document> mongoCollection = mongoDB.getCollection("myCollection");

BulkWriteResult bulkWriteResult = null;
try {
    //Perform bulk update
    bulkWriteResult = mongoCollection.bulkWrite(updateDocuments,
            bulkWriteOptions);
} catch (BulkWriteException e) {
    //Handle bulkwrite exception
    List<BulkWriteError> bulkWriteErrors = e.getWriteErrors();
    for (BulkWriteError bulkWriteError : bulkWriteErrors) {
        int failedIndex = bulkWriteError.getIndex();
        Long failedEntityId = entityIDs.get(failedIndex);
        System.out.println("Failed record: " + failedEntityId);
        //handle rollback
    }
}

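//bulkWriteResult stays null if bulkWrite threw, so guard before reading it
int rowsUpdated = (bulkWriteResult != null) ? bulkWriteResult.getModifiedCount() : 0;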


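Now let's walk through the pieces:

entityIDs: the list of _id values to update

filterDocument: the query filter, equivalent to the SQL WHERE clause

setDocument: the values to update, equivalent to the SQL SET clause

updateOptions: how each update should behave, for example whether to upsert or to bypass document validation

bulkWriteOptions: write operation preferences. If the updates for different entityIDs are independent of each other, go for unordered execution: the operations may then be applied in any order, much like parallel threads, and the remaining ones still run after an individual failure.


bulkWriteErrors: the errors, if any, raised during the update. Each error's index is the position of the failed operation in the list passed to bulkWrite.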
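The difference between ordered and unordered execution, and the way a failed index maps back to an entity ID, can be illustrated with a toy model that does not touch MongoDB at all. Everything here is an assumption made for illustration: `BulkWriteSketch`, `failedIndexes` and `toEntityIds` are hypothetical names, the `fails` predicate stands in for whatever makes a real write fail, and the loop keeps list order even in unordered mode, whereas a real server is free to reorder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class BulkWriteSketch {

    /**
     * Toy stand-in for a bulk write: returns the list positions of the operations that failed.
     * Ordered mode stops at the first failure; unordered mode keeps going.
     */
    static List<Integer> failedIndexes(List<Long> ids, Predicate<Long> fails, boolean ordered) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            if (fails.test(ids.get(i))) {
                failed.add(i);
                if (ordered) {
                    break; // ordered: remaining operations are never attempted
                }
            }
        }
        return failed;
    }

    /** Maps failed positions back to entity IDs, as the catch block in the post does. */
    static List<Long> toEntityIds(List<Long> ids, List<Integer> failedIndexes) {
        List<Long> result = new ArrayList<>();
        for (int index : failedIndexes) {
            result.add(ids.get(index));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(10L, 20L, 30L, 40L, 50L);
        Predicate<Long> fails = id -> id == 20L || id == 40L; // pretend these two writes fail

        System.out.println(toEntityIds(ids, failedIndexes(ids, fails, true)));  // prints [20]
        System.out.println(toEntityIds(ids, failedIndexes(ids, fails, false))); // prints [20, 40]
    }
}
```

In ordered mode only the first failure is reported, and the operations after it (30, 40 and 50 in this model) are never attempted, so a retry has to cover them as well as the failed ID.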


For bulk deletion, we just need to prepare DeleteOneModel objects instead of UpdateOneModel objects; the rest stays the same:

List<WriteModel<Document>> deleteDocuments = new ArrayList<WriteModel<Document>>();
for (Long entityId : entityIDs) {

    //Finder doc
    Document filterDocument = new Document();
    filterDocument.append("_id", entityId);

    //Delete model (a WriteModel, not a Document)
    DeleteOneModel<Document> deleteModel = new DeleteOneModel<Document>(filterDocument);
    //Prepare list of Deletes
    deleteDocuments.add(deleteModel);
}

