Monday, 9 December 2013

MongoDB

MongoDB Tutorial

MongoDB is an open-source document database and a leading NoSQL database. MongoDB is written in C++.
This tutorial introduces the MongoDB concepts needed to create and deploy a highly scalable, performance-oriented database.

MongoDB Overview

MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collections and documents.

Database

A database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.

Collection

A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema, so documents within a collection can have different fields. Typically, all documents in a collection serve a similar or related purpose.

Document

A document is a set of key-value pairs. Documents have a dynamic schema: documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.
The following table shows how RDBMS terminology maps to MongoDB:
RDBMS | MongoDB
Database | Database
Table | Collection
Tuple/Row | Document
Column | Field
Table Join | Embedded Documents
Primary Key | Primary Key (default key _id provided by MongoDB itself)
Database Server and Client
mysqld/Oracle | mongod
mysql/sqlplus | mongo

Sample document

The following example shows the document structure of a blog site, which is simply a set of comma-separated key-value pairs.
{
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by: 'only for programmers',
   url: 'http://www.only4programmers.blogspot.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100, 
   comments: [	
      {
         user:'user1',
         message: 'My first comment',
         dateCreated: new Date(2011,1,20,2,15),
         like: 0 
      },
      {
         user:'user2',
         message: 'My second comments',
         dateCreated: new Date(2011,1,25,7,45),
         like: 5
      }
   ]
}
_id is a 12-byte value, displayed as a hexadecimal number, that ensures the uniqueness of every document. You can provide _id while inserting the document; if you don't, MongoDB generates a unique id for the document. Of these 12 bytes, the first 4 hold the current timestamp, the next 3 the machine id, the next 2 the process id of the MongoDB server, and the remaining 3 a simple incrementing value. The ObjectId values shown in this tutorial are abbreviated; a full 12-byte ObjectId prints as 24 hexadecimal characters.
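The byte layout described above can be illustrated with a small sketch in plain JavaScript. The function name parseObjectIdHex and the sample value are made up for illustration; the sketch simply slices a 24-character hex string into the four parts (4 + 3 + 2 + 3 bytes) and is not part of the MongoDB shell.

```javascript
// Split a 24-character hex ObjectId string into the four parts described above.
// The sample value below is invented for illustration.
function parseObjectIdHex(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  return {
    timestampSeconds: parseInt(hex.slice(0, 8), 16), // bytes 0-3: seconds since the Unix epoch
    machineId: hex.slice(8, 14),                     // bytes 4-6: machine id (kept as hex)
    processId: parseInt(hex.slice(14, 18), 16),      // bytes 7-8: process id
    counter: parseInt(hex.slice(18, 24), 16)         // bytes 9-11: incrementing value
  };
}

const parts = parseObjectIdHex("52a5b2000a1b2c0d0e00002a");
console.log(new Date(parts.timestampSeconds * 1000).toISOString());
console.log(parts.machineId, parts.processId, parts.counter);
```

Because the timestamp comes first, ids generated later compare greater than ids generated earlier, at least at one-second granularity.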

MongoDB Advantages

Any relational database has a typical schema design that shows the number of tables and the relationships between these tables. In MongoDB, there is no concept of relationship.

Advantages of MongoDB over RDBMS

  • Schema-less: MongoDB is a document database in which one collection can hold different documents. The number of fields, the content, and the size of a document can differ from one document to another.
  • Structure of a single object is clear
  • No complex joins
  • Deep query-ability: MongoDB supports dynamic queries on documents using a document-based query language that's nearly as powerful as SQL
  • Tuning
  • Ease of scale-out: MongoDB is easy to scale
  • Conversion / mapping of application objects to database objects not needed
  • Uses internal memory for storing the (windowed) working set, enabling faster access to data

Why use MongoDB?

  • Document Oriented Storage : Data is stored in the form of JSON style documents
  • Index on any attribute
  • Replication & High Availability
  • Auto-Sharding
  • Rich Queries
  • Fast In-Place Updates
  • Professional Support By MongoDB

Where to use MongoDB?

  • Big Data
  • Content Management and Delivery
  • Mobile and Social Infrastructure
  • User Data Management
  • Data Hub

MongoDB Environment

Install MongoDB On Windows

To install MongoDB on Windows, first download the latest release of MongoDB from http://www.mongodb.org/downloads. Make sure you get the correct version of MongoDB for your Windows version. To find your Windows architecture, open a command prompt and execute the following command:
C:\>wmic os get osarchitecture
OSArchitecture
64-bit
C:\>
32-bit versions of MongoDB only support databases smaller than 2GB and are suitable only for testing and evaluation purposes.
Now extract the downloaded file to the c:\ drive or any other location. Make sure the name of the extracted folder is mongodb-win32-i386-[version] or mongodb-win32-x86_64-[version], where [version] is the version of the MongoDB download.
Now open a command prompt and run the following command:
C:\>move mongodb-win32-* mongodb
      1 dir(s) moved.
C:\>
If you extracted MongoDB to a different location, go to that path using the command cd FOLDER/DIR and then run the command given above.
MongoDB requires a data folder to store its files. The default location for the MongoDB data directory is c:\data\db, so you need to create this folder using the command prompt. Execute the following command sequence:
C:\>md data
C:\>md data\db
If you installed MongoDB at a different location, you need to specify an alternate path for \data\db by setting the dbpath option of mongod.exe. To do so, issue the following commands.
In the command prompt, navigate to the bin directory inside the MongoDB installation folder. Suppose the installation folder is D:\set up\mongodb:
 
C:\Users\XYZ>d:
D:\>cd "set up"
D:\set up>cd mongodb
D:\set up\mongodb>cd bin
D:\set up\mongodb\bin>mongod.exe --dbpath "d:\set up\mongodb\data" 
This shows a "waiting for connections" message in the console output, which indicates that the mongod.exe process is running successfully.
Now, to use MongoDB, open another command prompt and issue the following command:
 
D:\set up\mongodb\bin>mongo.exe
MongoDB shell version: 2.4.6
connecting to: test
>db.test.save( { a: 1 } )
>db.test.find()
{ "_id" : ObjectId(5879b0f65a56a454), "a" : 1 }
>
This shows that MongoDB is installed and running successfully. The next time you run MongoDB, you need to issue only these commands:
 
D:\set up\mongodb\bin>mongod.exe --dbpath "d:\set up\mongodb\data" 
D:\set up\mongodb\bin>mongo.exe

Install MongoDB on Ubuntu

Run the following command to import the MongoDB public GPG Key:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Create a /etc/apt/sources.list.d/mongodb.list file using the following command.
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
Now issue the following command to update the repository:
sudo apt-get update
Now install MongoDB using the following command:
sudo apt-get install mongodb-10gen=2.2.3
In the above command, 2.2.3 is the MongoDB version to install; substitute the latest released version. MongoDB is now installed.
Start MongoDB
sudo service mongodb start
Stop MongoDB
sudo service mongodb stop
Restart MongoDB
sudo service mongodb restart
To use MongoDB, run the following command:
mongo
This will connect you to the running mongod instance.

MongoDB Help

To get a list of commands, type db.help() in the MongoDB client. This will give you a list of commands, as shown in the following screenshot:
[Image: DB Help]

MongoDB Statistics

To get statistics about the current database, type the command db.stats() in the MongoDB client. This shows the database name and the number of collections and documents in the database. The output of the command is shown in the following screenshot:
[Image: DB Stats]

MongoDB Data Modelling

Data in MongoDB has a flexible schema. Documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.

Some considerations while designing schema in MongoDB

  • Design your schema according to user requirements.
  • Combine objects into one document if you will use them together; otherwise, separate them (but make sure there is no need for joins).
  • Duplicate data (within limits), because disk space is cheap compared to compute time.
  • Do joins on write, not on read.
  • Optimize your schema for the most frequent use cases.
  • Do complex aggregation in the schema.

Example

Suppose a client needs a database design for a blog website; this lets us see the differences between RDBMS and MongoDB schema design. The website has the following requirements:
  • Every post has a unique title, description and url.
  • Every post can have one or more tags.
  • Every post has the name of its publisher and the total number of likes.
  • Every post has comments given by users, along with their name, message, date-time and likes.
  • On each post there can be zero or more comments.
In an RDBMS, the schema design for the above requirements will have a minimum of three tables.
[Image: RDBMS Schema Design]
In MongoDB, the schema design will have one collection, post, with the following structure:
{
   _id: POST_ID,
   title: TITLE_OF_POST, 
   description: POST_DESCRIPTION,
   by: POST_BY,
   url: URL_OF_POST,
   tags: [TAG1, TAG2, TAG3],
   likes: TOTAL_LIKES, 
   comments: [	
      {
         user:'COMMENT_BY',
         message: TEXT,
         dateCreated: DATE_TIME,
         like: LIKES 
      },
      {
         user:'COMMENT_BY',
         message: TEXT,
         dateCreated: DATE_TIME,
         like: LIKES
      }
   ]
}
So, to show the data in an RDBMS you need to join three tables, while in MongoDB the data is shown from one collection only.
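The difference between the two designs can be sketched in plain JavaScript with in-memory arrays standing in for tables. All names here (postRows, tagRows, commentRows, loadPostWithJoins) are hypothetical and exist only for this illustration; no database is involved.

```javascript
// Relational-style rows: posts, tags and comments live in three separate "tables".
const postRows = [{ id: 1, title: "MongoDB Overview", likes: 100 }];
const tagRows = [
  { postId: 1, tag: "mongodb" },
  { postId: 1, tag: "NoSQL" }
];
const commentRows = [{ postId: 1, user: "user1", message: "My first comment" }];

// Reading one post means joining the three tables on the post id.
function loadPostWithJoins(postId) {
  const post = postRows.find(p => p.id === postId);
  return {
    ...post,
    tags: tagRows.filter(t => t.postId === postId).map(t => t.tag),
    comments: commentRows
      .filter(c => c.postId === postId)
      .map(c => ({ user: c.user, message: c.message }))
  };
}

// Document-style storage: the same data is already assembled in one object.
const postDocument = {
  id: 1,
  title: "MongoDB Overview",
  likes: 100,
  tags: ["mongodb", "NoSQL"],
  comments: [{ user: "user1", message: "My first comment" }]
};

// Both approaches yield the same post; only the amount of work at read time differs.
console.log(JSON.stringify(loadPostWithJoins(1)) === JSON.stringify(postDocument));
```

The embedded form moves the assembly work to write time, which matches the "do joins on write, not on read" guideline above.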

MongoDB Create Database

The use Command

The MongoDB command use DATABASE_NAME is used to create a database. The command creates a new database if it doesn't exist; otherwise, it switches to the existing database.

SYNTAX:

Basic syntax of use DATABASE statement is as follows:
use DATABASE_NAME

EXAMPLE:

If you want to create a database named mydb, the use DATABASE statement would be as follows:
>use mydb
switched to db mydb
To check your currently selected database use the command db
>db
mydb
If you want to check your databases list, then use the command show dbs.
>show dbs
local     0.78125GB
test      0.23012GB
Your created database (mydb) is not present in the list. To display the database, you need to insert at least one document into it.
>db.movie.insert({"name":"only4programmers"})
>show dbs
local      0.78125GB
mydb       0.23012GB
test       0.23012GB
In MongoDB the default database is test. If you don't create any database, collections will be stored in the test database.

MongoDB Drop Database

The dropDatabase() Method

The MongoDB db.dropDatabase() command is used to drop an existing database.

SYNTAX:

Basic syntax of dropDatabase() command is as follows:
db.dropDatabase()
This will delete the selected database. If you have not selected any database, it will delete the default 'test' database.

EXAMPLE:

First, check the list of available databases by using the command show dbs:
>show dbs
local      0.78125GB
mydb       0.23012GB
test       0.23012GB
>
If you want to delete new database <mydb>, then dropDatabase() command would be as follows:
>use mydb
switched to db mydb
>db.dropDatabase()
{ "dropped" : "mydb", "ok" : 1 }
>
Now check list of databases
>show dbs
local      0.78125GB
test       0.23012GB
>

MongoDB Create Collection

The createCollection() Method

The MongoDB db.createCollection(name, options) method is used to create a collection.

SYNTAX:

Basic syntax of createCollection() command is as follows
db.createCollection(name, options)
In the command, name is the name of the collection to be created. options is a document used to specify the configuration of the collection.
Parameter | Type | Description
name | String | Name of the collection to be created
options | Document | (Optional) Specify options about memory size and indexing
The options parameter is optional, so you need to specify only the name of the collection. Following is the list of options you can use:
Field | Type | Description
capped | Boolean | (Optional) If true, enables a capped collection. A capped collection is a fixed-size collection that automatically overwrites its oldest entries when it reaches its maximum size. If you specify true, you need to specify the size parameter also.
autoIndexId | Boolean | (Optional) If true, automatically creates an index on the _id field. Default value is false.
size | number | (Optional) Specifies a maximum size in bytes for a capped collection. If capped is true, then you need to specify this field also.
max | number | (Optional) Specifies the maximum number of documents allowed in the capped collection.
While inserting a document, MongoDB first checks the size field of the capped collection, then it checks the max field.
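The overwrite-oldest behaviour of a capped collection can be sketched with a small in-memory class. This is only an illustration under simplifying assumptions: the class name CappedSketch is made up, and a document's size is approximated by the length of its JSON text rather than its real storage size.

```javascript
// In-memory stand-in for a capped collection (not MongoDB code).
// Document size is approximated by the length of its JSON text.
class CappedSketch {
  constructor({ size, max = Infinity }) {
    this.size = size; // maximum total size, like the size option
    this.max = max;   // maximum number of documents, like the max option
    this.docs = [];
    this.used = 0;
  }

  insert(doc) {
    const bytes = JSON.stringify(doc).length;
    if (bytes > this.size) throw new Error("document larger than the collection");
    // Drop the oldest entries until both the size and max limits are respected.
    while (this.used + bytes > this.size || this.docs.length >= this.max) {
      const oldest = this.docs.shift();
      this.used -= JSON.stringify(oldest).length;
    }
    this.docs.push(doc);
    this.used += bytes;
  }
}

const log = new CappedSketch({ size: 1000, max: 3 });
for (let n = 1; n <= 5; n++) log.insert({ n });
console.log(log.docs.map(d => d.n)); // only the three newest documents remain
```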

EXAMPLES:

An example of the createCollection() method without options is as follows:
>use test
switched to db test
>db.createCollection("mycollection")
{ "ok" : 1 }
>
You can check the created collection by using the command show collections
>show collections
mycollection
system.indexes
The following example shows the createCollection() method with a few important options:
>db.createCollection("mycol", { capped : true, autoIndexId : true, size : 6142800, max : 10000 } )
{ "ok" : 1 }
>
In MongoDB you don't need to create a collection explicitly. MongoDB creates the collection automatically when you insert a document.
>db.only4programmers.insert({"name" : "only4programmers"})
>show collections
mycol
mycollection
system.indexes
only4programmers
>

MongoDB Drop Collection

The drop() Method

MongoDB's db.collection.drop() is used to drop a collection from the database.

SYNTAX:

Basic syntax of drop() command is as follows
db.COLLECTION_NAME.drop()

EXAMPLE:

First, check the available collections into your database mydb
>use mydb
switched to db mydb
>show collections
mycol
mycollection
system.indexes
only4programmers
>
Now drop the collection with the name mycollection
>db.mycollection.drop()
true
>
Check the list of collections in the database again:
>show collections
mycol
system.indexes
only4programmers
>
The drop() method returns true if the selected collection is dropped successfully; otherwise, it returns false.

MongoDB Datatypes

MongoDB supports many datatypes whose list is given below:
  • String : This is the most commonly used datatype to store data. Strings in MongoDB must be valid UTF-8.
  • Integer : This type is used to store a numerical value. Integer can be 32 bit or 64 bit depending upon your server.
  • Boolean : This type is used to store a boolean (true/false) value.
  • Double : This type is used to store floating point values.
  • Min/Max keys : This type is used to compare a value against the lowest and highest BSON elements.
  • Arrays : This type is used to store arrays, lists, or multiple values under one key.
  • Timestamp : Stores a timestamp. This can be handy for recording when a document has been modified or added.
  • Object : This datatype is used for embedded documents.
  • Null : This type is used to store a Null value.
  • Symbol : This datatype is used identically to a string; however, it's generally reserved for languages that use a specific symbol type.
  • Date : This datatype is used to store the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing the year, month, and day into it.
  • Object ID : This datatype is used to store the document's ID.
  • Binary data : This datatype is used to store binary data.
  • Code : This datatype is used to store JavaScript code in a document.
  • Regular expression : This datatype is used to store a regular expression.
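One detail of the Date type is easy to get wrong. The sample documents in this tutorial create dates with new Date(2011,1,20,2,15), and in JavaScript the month argument is zero-based. The following short sketch (variable names are illustrative only) shows what that means in practice:

```javascript
// JavaScript's Date constructor takes (year, monthIndex, day, hours, minutes),
// where monthIndex is zero-based: 0 is January, 1 is February, and so on.
const commentDate = new Date(2011, 1, 20, 2, 15);

console.log(commentDate.getFullYear()); // year
console.log(commentDate.getMonth());    // zero-based month index
console.log(commentDate.getDate());     // day of the month

// To store 20 January 2011 instead, the month index would be 0.
const januaryDate = new Date(2011, 0, 20, 2, 15);
console.log(januaryDate.getMonth());
```

So the date in the sample comment is in February, not January.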

MongoDB - Insert Document

The insert() Method

To insert data into a MongoDB collection, you need to use MongoDB's insert() or save() method.

SYNTAX

Basic syntax of insert() command is as follows:
>db.COLLECTION_NAME.insert(document)

EXAMPLE

>db.mycol.insert({
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by: 'only for programmers',
   url: 'http://www.only4programmers.blogspot.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
})
Here mycol is our collection name, as created in the previous section. If the collection doesn't exist in the database, MongoDB creates the collection and then inserts the document into it.
If we don't specify the _id parameter in the inserted document, MongoDB assigns a unique ObjectId to the document.
_id is a 12-byte hexadecimal number, unique for every document in a collection. The 12 bytes are divided as follows:
_id: ObjectId(4 bytes timestamp, 3 bytes machine id, 2 bytes process id, 3 bytes incrementer)
To insert multiple documents in a single query, you can pass an array of documents to the insert() command.

EXAMPLE

>db.post.insert([
{
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by: 'only for programmers',
   url: 'http://www.only4programmers.blogspot.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   title: 'NoSQL Database', 
   description: "NoSQL database doesn't have tables",
   by: 'only for programmers',
   url: 'http://www.only4programmers.blogspot.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 20, 
   comments: [	
      {
         user:'user1',
         message: 'My first comment',
         dateCreated: new Date(2013,11,10,2,35),
         like: 0 
      }
   ]
}
])
To insert a document you can also use db.post.save(document). If you don't specify _id in the document, the save() method works the same as the insert() method. If you specify _id, it replaces the whole document that has that _id with the document passed to save().
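The difference between insert() and save() can be sketched with a plain JavaScript Map standing in for a collection. The helper names insertDoc and saveDoc and the numeric ids are assumptions made for this illustration; this is not how MongoDB implements either method.

```javascript
// In-memory sketch of the insert()/save() difference described above.
// `store` is a Map keyed by _id; ids are simple numbers here, not ObjectIds.
let nextId = 1;
const store = new Map();

function insertDoc(doc) {
  const _id = doc._id !== undefined ? doc._id : nextId++; // generate an id when missing
  if (store.has(_id)) throw new Error("duplicate key: " + _id);
  store.set(_id, { ...doc, _id });
  return _id;
}

function saveDoc(doc) {
  if (doc._id === undefined) return insertDoc(doc); // no _id: behaves like insert
  store.set(doc._id, { ...doc });                   // _id given: replace the whole document
  return doc._id;
}

const id = saveDoc({ title: "MongoDB Overview", likes: 100 });
saveDoc({ _id: id, title: "New Title" });
console.log(store.get(id)); // the likes field is gone, because the document was replaced
```

Note that replacing drops every field that is not present in the new document, which is why the update() method with $set is usually preferred for changing a single field.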

MongoDB - Query Document

The find() Method

To query data from MongoDB collection, you need to use MongoDB's find() method.

SYNTAX

Basic syntax of find() method is as follows
>db.COLLECTION_NAME.find()
The find() method displays all the documents in a non-structured way.

The pretty() Method

To display the results in a formatted way, you can use pretty() method.

SYNTAX:

>db.mycol.find().pretty()

Example

>db.mycol.find().pretty()
{
   "_id": ObjectId(7df78ad8902c),
   "title": "MongoDB Overview", 
   "description": "MongoDB is no sql database",
   "by": "only for programmers",
   "url": "http://www.only4programmers.blogspot.com",
   "tags": ["mongodb", "database", "NoSQL"],
   "likes": 100
}
>
Apart from the find() method there is the findOne() method, which returns only one document.

RDBMS Where Clause Equivalents in MongoDB

To query documents on the basis of some condition, you can use the following operations:
Operation | Syntax | Example | RDBMS Equivalent
Equality | {<key>:<value>} | db.mycol.find({"by":"only for programmers"}).pretty() | where by = 'only for programmers'
Less Than | {<key>:{$lt:<value>}} | db.mycol.find({"likes":{$lt:50}}).pretty() | where likes < 50
Less Than Equals | {<key>:{$lte:<value>}} | db.mycol.find({"likes":{$lte:50}}).pretty() | where likes <= 50
Greater Than | {<key>:{$gt:<value>}} | db.mycol.find({"likes":{$gt:50}}).pretty() | where likes > 50
Greater Than Equals | {<key>:{$gte:<value>}} | db.mycol.find({"likes":{$gte:50}}).pretty() | where likes >= 50
Not Equals | {<key>:{$ne:<value>}} | db.mycol.find({"likes":{$ne:50}}).pretty() | where likes != 50
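To make the meaning of these operators concrete, here is a minimal in-memory matcher in plain JavaScript. It is a sketch only: the names comparators, matchesCondition and findIn are invented for this example, it handles only top-level fields of the same type, and it is not MongoDB's query engine.

```javascript
// Minimal in-memory matcher for the operators in the table above.
const comparators = {
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $ne: (a, b) => a !== b
};

function matchesCondition(value, condition) {
  const isOperatorDoc =
    condition !== null && typeof condition === "object" &&
    Object.keys(condition).every(k => k in comparators);
  if (!isOperatorDoc) return value === condition; // plain equality: {key: value}
  return Object.entries(condition).every(([op, operand]) => comparators[op](value, operand));
}

function findIn(docs, query) {
  return docs.filter(doc =>
    Object.entries(query).every(([key, condition]) => matchesCondition(doc[key], condition)));
}

const posts = [
  { title: "MongoDB Overview", by: "only for programmers", likes: 100 },
  { title: "NoSQL Database", by: "only for programmers", likes: 20 }
];

console.log(findIn(posts, { likes: { $lt: 50 } }).map(d => d.title));
console.log(findIn(posts, { likes: { $gte: 50 } }).map(d => d.title));
console.log(findIn(posts, { by: "only for programmers" }).length);
```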

AND in MongoDB

SYNTAX:

In the find() method, if you pass multiple keys separated by ',' then MongoDB treats them as an AND condition. Basic syntax of AND is shown below:
>db.mycol.find({key1:value1, key2:value2}).pretty()

EXAMPLE

The following example shows all the tutorials written by 'only for programmers' whose title is 'MongoDB Overview':
>db.mycol.find({"by":"only for programmers","title": "MongoDB Overview"}).pretty()
{
   "_id": ObjectId(7df78ad8902c),
   "title": "MongoDB Overview", 
   "description": "MongoDB is no sql database",
   "by": "only for programmers",
   "url": "http://www.only4programmers.blogspot.com",
   "tags": ["mongodb", "database", "NoSQL"],
   "likes": 100
}
>
For the above example, the equivalent where clause is ' where by='only for programmers' AND title='MongoDB Overview' '. You can pass any number of key-value pairs in the find clause.

OR in MongoDB

SYNTAX:

To query documents based on the OR condition, you need to use $or keyword. Basic syntax of OR is shown below:
>db.mycol.find(
   {
      $or: [
	     {key1: value1}, {key2:value2}
      ]
   }
).pretty()

EXAMPLE

The following example shows all the tutorials written by 'only for programmers' or whose title is 'MongoDB Overview':
>db.mycol.find({$or:[{"by":"only for programmers"},{"title": "MongoDB Overview"}]}).pretty()
{
   "_id": ObjectId(7df78ad8902c),
   "title": "MongoDB Overview", 
   "description": "MongoDB is no sql database",
   "by": "only for programmers",
   "url": "http://www.only4programmers.blogspot.com",
   "tags": ["mongodb", "database", "NoSQL"],
   "likes": 100
}
>

Using AND and OR together

EXAMPLE

The following example shows the documents that have likes greater than 10 and whose title is 'MongoDB Overview' or whose by is 'only for programmers'. The equivalent SQL where clause is 'where likes>10 AND (by = 'only for programmers' OR title = 'MongoDB Overview')'.
>db.mycol.find({"likes": {$gt:10}, $or: [{"by": "only for programmers"}, {"title": "MongoDB Overview"}] }).pretty()
{
   "_id": ObjectId(7df78ad8902c),
   "title": "MongoDB Overview", 
   "description": "MongoDB is no sql database",
   "by": "only for programmers",
   "url": "http://www.only4programmers.blogspot.com",
   "tags": ["mongodb", "database", "NoSQL"],
   "likes": 100
}
>
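How AND and OR combine in one query document can be sketched in plain JavaScript. The functions matchField and matchesQuery are hypothetical helpers written for this illustration; they support only equality, $gt and $or, and they evaluate against an in-memory array rather than a real collection.

```javascript
// Sketch: evaluate a query with implicit AND plus a $or clause (equality and $gt only).
function matchField(value, condition) {
  if (condition !== null && typeof condition === "object" && "$gt" in condition) {
    return value > condition.$gt;
  }
  return value === condition;
}

function matchesQuery(doc, query) {
  // Every top-level key must match (AND); $or needs at least one matching sub-query.
  return Object.entries(query).every(([key, condition]) =>
    key === "$or"
      ? condition.some(sub => matchesQuery(doc, sub))
      : matchField(doc[key], condition));
}

const docs = [
  { title: "MongoDB Overview", by: "only for programmers", likes: 100 },
  { title: "Neo4j Overview", by: "Neo4j", likes: 750 },
  { title: "NoSQL Overview", by: "only for programmers", likes: 5 }
];

const query = {
  likes: { $gt: 10 },
  $or: [{ by: "only for programmers" }, { title: "MongoDB Overview" }]
};

console.log(docs.filter(d => matchesQuery(d, query)).map(d => d.title));
```

The second document fails the $or part and the third fails the likes condition, so only the first one is selected.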

MongoDB Update Document

MongoDB's update() and save() methods are used to update documents in a collection. The update() method updates values in the existing document, while the save() method replaces the existing document with the document passed to save().

MongoDB Update() method

The update() method updates values in the existing document.

SYNTAX:

Basic syntax of update() method is as follows
>db.COLLECTION_NAME.update(SELECTION_CRITERIA, UPDATED_DATA)

EXAMPLE

Consider that the mycol collection has the following data.
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
Following example will set the new title 'New MongoDB Tutorial' of the documents whose title is 'MongoDB Overview'
>db.mycol.update({'title':'MongoDB Overview'},{$set:{'title':'New MongoDB Tutorial'}})
>db.mycol.find()
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"New MongoDB Tutorial"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
>
By default MongoDB updates only a single document. To update multiple documents, you need to set the parameter 'multi' to true.
>db.mycol.update({'title':'MongoDB Overview'},{$set:{'title':'New MongoDB Tutorial'}},{multi:true})
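The effect of $set and of the multi option can be sketched over an in-memory array. The function updateDocs and the sample data are assumptions for this illustration, and the criteria are limited to simple equality; it is not the MongoDB implementation.

```javascript
// In-memory sketch of update(criteria, {$set: ...}, {multi}) with equality criteria only.
function updateDocs(docs, criteria, update, options = {}) {
  let modified = 0;
  for (const doc of docs) {
    const matches = Object.entries(criteria).every(([k, v]) => doc[k] === v);
    if (!matches) continue;
    Object.assign(doc, update.$set); // change only the listed fields
    modified++;
    if (!options.multi) break;       // default: stop after the first match
  }
  return modified;
}

// Returns a fresh copy of the sample data on every call.
const sampleTitles = () => [
  { _id: 1, title: "MongoDB Overview" },
  { _id: 2, title: "MongoDB Overview" },
  { _id: 3, title: "NoSQL Overview" }
];

const criteria = { title: "MongoDB Overview" };
const change = { $set: { title: "New MongoDB Tutorial" } };

console.log(updateDocs(sampleTitles(), criteria, change));                  // first match only
console.log(updateDocs(sampleTitles(), criteria, change, { multi: true })); // every match
```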

MongoDB Save() Method

The save() method replaces the existing document with the new document passed in save() method

SYNTAX

Basic syntax of mongodb save() method is shown below:
>db.COLLECTION_NAME.save({_id:ObjectId(),NEW_DATA})

EXAMPLE

The following example replaces the document with the _id '5983548781331adf45ec7':
>db.mycol.save(
   {
      "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers New Topic", "by":"only for programmers"
   }
)
>db.mycol.find()
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"New MongoDB Tutorial"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers New Topic", "by":"only for programmers"}
>

MongoDB Delete Document

The remove() Method

MongoDB's remove() method is used to remove documents from a collection. The remove() method accepts two parameters: the deletion criteria and the justOne flag.
  1. deletion criteria : (Optional) the criteria according to which documents will be removed.
  2. justOne : (Optional) if set to true or 1, then only one document is removed.

SYNTAX:

Basic syntax of remove() method is as follows
>db.COLLECTION_NAME.remove(DELETION_CRITERIA)

EXAMPLE

Consider that the mycol collection has the following data.
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
Following example will remove all the documents whose title is 'MongoDB Overview'
>db.mycol.remove({'title':'MongoDB Overview'})
>db.mycol.find()
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
>

Remove only one

If there are multiple records and you want to delete only the first record, set the justOne parameter in the remove() method:
>db.COLLECTION_NAME.remove(DELETION_CRITERIA,1)

Remove All documents

If you don't specify deletion criteria, MongoDB deletes all documents from the collection. This is the equivalent of SQL's truncate command.
>db.mycol.remove()
>db.mycol.find()
>
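The three cases above (remove by criteria, remove only one, remove all) can be sketched over an in-memory array. The function removeDocs is a made-up helper that returns the documents that remain; it uses equality criteria only and is not MongoDB code.

```javascript
// In-memory sketch of remove(criteria, justOne) using equality criteria only.
// Returns the documents that are left after the removal.
function removeDocs(docs, criteria = {}, justOne = false) {
  const kept = [];
  let removed = 0;
  for (const doc of docs) {
    const matches = Object.entries(criteria).every(([k, v]) => doc[k] === v);
    if (matches && !(justOne && removed >= 1)) {
      removed++;       // drop this document
    } else {
      kept.push(doc);  // keep non-matching documents (and later matches when justOne is set)
    }
  }
  return kept;
}

const sampleDocs = [
  { _id: 1, title: "MongoDB Overview" },
  { _id: 2, title: "MongoDB Overview" },
  { _id: 3, title: "NoSQL Overview" }
];

console.log(removeDocs(sampleDocs, { title: "MongoDB Overview" }).length);    // all matches removed
console.log(removeDocs(sampleDocs, { title: "MongoDB Overview" }, 1).length); // only the first match removed
console.log(removeDocs(sampleDocs).length);                                   // no criteria: everything removed
```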

MongoDB Projection

In MongoDB, projection means selecting only the necessary data rather than selecting the whole of a document's data. If a document has 5 fields and you need to show only 3, then select only those 3 fields.

The find() Method

MongoDB's find() method, explained in MongoDB Query Document, accepts a second optional parameter: the list of fields that you want to retrieve. When you execute the find() method without it, all fields of a document are displayed. To limit this, you need to set a list of fields with the value 1 or 0: 1 is used to show the field, while 0 is used to hide it.

SYNTAX:

Basic syntax of find() method with projection is as follows
>db.COLLECTION_NAME.find({},{KEY:1})

EXAMPLE

Consider that the collection mycol has the following data:
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
The following example displays only the title of each document when querying the collection.
>db.mycol.find({},{"title":1,_id:0})
{"title":"MongoDB Overview"}
{"title":"NoSQL Overview"}
{"title":"only for programmers Overview"}
>
Please note that the _id field is always displayed when executing the find() method; if you don't want this field, you need to set it to 0.
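The rule about _id can be sketched with a small projection helper in plain JavaScript. The function project is hypothetical and covers only inclusion projections (fields set to 1, plus the optional _id: 0); it is meant to illustrate the behaviour described above, not to reproduce MongoDB's implementation.

```javascript
// In-memory sketch of an inclusion projection such as {title: 1, _id: 0}.
function project(doc, projection) {
  const result = {};
  // _id is kept unless it is explicitly set to 0.
  if (projection._id !== 0 && "_id" in doc) result._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) result[field] = doc[field];
  }
  return result;
}

const fullDoc = { _id: 1, title: "MongoDB Overview", likes: 100 };

console.log(project(fullDoc, { title: 1 }));         // _id is included by default
console.log(project(fullDoc, { title: 1, _id: 0 })); // _id is hidden explicitly
```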

MongoDB Limit Records

The Limit() Method

To limit the records in MongoDB, you need to use the limit() method. The limit() method accepts one number-type argument, which is the number of documents that you want displayed.

SYNTAX:

Basic syntax of limit() method is as follows
>db.COLLECTION_NAME.find().limit(NUMBER)

EXAMPLE

Consider that the collection mycol has the following data:
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
The following example displays only 2 documents when querying the collection.
>db.mycol.find({},{"title":1,_id:0}).limit(2)
{"title":"MongoDB Overview"}
{"title":"NoSQL Overview"}
>
If you don't specify the number argument in the limit() method, it displays all documents from the collection.

MongoDB Skip() Method

Apart from the limit() method there is one more method, skip(), which also accepts a number-type argument and is used to skip that number of documents.

SYNTAX:

Basic syntax of skip() method is as follows
>db.COLLECTION_NAME.find().limit(NUMBER).skip(NUMBER)

EXAMPLE:

The following example displays only the second document.
>db.mycol.find({},{"title":1,_id:0}).limit(1).skip(1)
{"title":"NoSQL Overview"}
>
Please note that the default value in the skip() method is 0.
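The combination of skip and limit amounts to taking a slice of the result list. The following plain JavaScript sketch (the function name page is made up) applies skip first and then limit, which is also how the example above behaves: the first document is skipped and one document is returned.

```javascript
// Sketch of skip/limit over an in-memory array; skip is applied first, then limit.
function page(docs, { skip = 0, limit } = {}) {
  const end = limit === undefined ? undefined : skip + limit; // no limit: take the rest
  return docs.slice(skip, end);
}

const titleDocs = [
  { title: "MongoDB Overview" },
  { title: "NoSQL Overview" },
  { title: "only for programmers Overview" }
];

console.log(page(titleDocs, { limit: 1, skip: 1 })); // only the second document
console.log(page(titleDocs, { limit: 2 }));          // the first two documents
console.log(page(titleDocs).length);                 // no arguments: all documents
```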

MongoDB Sort Documents

The sort() Method

To sort documents in MongoDB, you need to use the sort() method. The sort() method accepts a document containing a list of fields along with their sorting order. To specify the sorting order, 1 and -1 are used: 1 is used for ascending order, while -1 is used for descending order.

SYNTAX:

Basic syntax of sort() method is as follows
>db.COLLECTION_NAME.find().sort({KEY:1})

EXAMPLE

Consider that the collection mycol has the following data:
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"only for programmers Overview"}
Following example will display the documents sorted by title in descending order.
>db.mycol.find({},{"title":1,_id:0}).sort({"title":-1})
{"title":"only for programmers Overview"}
{"title":"NoSQL Overview"}
{"title":"MongoDB Overview"}
>
Please note that if you don't specify a sorting preference, the documents are not sorted and are displayed in their natural order.
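The meaning of the 1 and -1 values can be sketched with an ordinary JavaScript comparator. The function sortDocs is a hypothetical helper for in-memory arrays; it compares values with the < and > operators, so it is only meaningful when the sorted field holds values of one type.

```javascript
// In-memory sketch of sort({KEY: 1}) / sort({KEY: -1}), including multiple keys.
function sortDocs(docs, spec) {
  const keys = Object.entries(spec);
  return [...docs].sort((a, b) => {
    for (const [key, direction] of keys) {
      if (a[key] < b[key]) return -direction; // ascending when direction is 1
      if (a[key] > b[key]) return direction;  // reversed when direction is -1
    }
    return 0; // equal on every key
  });
}

const unsorted = [
  { title: "MongoDB Overview" },
  { title: "NoSQL Overview" },
  { title: "only for programmers Overview" }
];

console.log(sortDocs(unsorted, { title: -1 }).map(d => d.title));
console.log(sortDocs(unsorted, { title: 1 }).map(d => d.title));
```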

MongoDB Indexing

Indexes support the efficient resolution of queries. Without indexes, MongoDB must scan every document of a collection to select those documents that match the query statement. This scan is highly inefficient and requires mongod to process a large volume of data.
Indexes are special data structures that store a small portion of the data set in an easy-to-traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field as specified in the index.
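The idea of an ordered, easy-to-traverse structure can be sketched in plain JavaScript. In this illustration a sorted array stands in for the index and binary search stands in for the traversal; the names buildIndex and findByIndex are invented, the keys are assumed to be unique, and this is not how MongoDB stores its indexes internally.

```javascript
// Sketch: an index as a sorted list of {key, position} entries, searched by binary search.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

function findByIndex(docs, index, value) {
  let low = 0;
  let high = index.length - 1;
  let steps = 0;
  while (low <= high) {
    steps++;
    const mid = (low + high) >> 1;
    if (index[mid].key === value) return { doc: docs[index[mid].position], steps };
    if (index[mid].key < value) low = mid + 1;
    else high = mid - 1;
  }
  return { doc: null, steps };
}

// 1000 made-up documents with likes 0..999.
const many = Array.from({ length: 1000 }, (_, i) => ({ title: "post " + i, likes: i }));
const likesIndex = buildIndex(many, "likes");
const hit = findByIndex(many, likesIndex, 742);
console.log(hit.doc.title, "found in", hit.steps, "steps");
```

A full scan would look at up to 1000 documents here, while the ordered lookup needs at most 10 comparisons.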

The ensureIndex() Method

To create an index you need to use MongoDB's ensureIndex() method.

SYNTAX:

Basic syntax of the ensureIndex() method is as follows:
>db.COLLECTION_NAME.ensureIndex({KEY:1})
Here KEY is the name of the field on which you want to create the index, and 1 is for ascending order. To create an index in descending order, use -1.

EXAMPLE

>db.mycol.ensureIndex({"title":1})
>
In the ensureIndex() method you can pass multiple fields to create an index on multiple fields.
>db.mycol.ensureIndex({"title":1,"description":-1})
>
The ensureIndex() method also accepts a list of options (all optional), given below:
Parameter | Type | Description
background | Boolean | Builds the index in the background so that building an index does not block other database activities. Specify true to build in the background. The default value is false.
unique | Boolean | Creates a unique index so that the collection will not accept insertion of documents where the index key or keys match an existing value in the index. Specify true to create a unique index. The default value is false.
name | string | The name of the index. If unspecified, MongoDB generates an index name by concatenating the names of the indexed fields and the sort order.
dropDups | Boolean | Creates a unique index on a field that may have duplicates. MongoDB indexes only the first occurrence of a key and removes all documents from the collection that contain subsequent occurrences of that key. Specify true to create a unique index. The default value is false.
sparse | Boolean | If true, the index only references documents with the specified field. These indexes use less space but behave differently in some situations (particularly sorts). The default value is false.
expireAfterSeconds | integer | Specifies a value, in seconds, as a TTL to control how long MongoDB retains documents in this collection.
v | index version | The index version number. The default index version depends on the version of mongod running when creating the index.
weights | document | The weight is a number ranging from 1 to 99,999 and denotes the significance of the field relative to the other indexed fields in terms of the score.
default_language | string | For a text index, the language that determines the list of stop words and the rules for the stemmer and tokenizer. The default value is english.
language_override | string | For a text index, the name of the field in the document that contains the language to override the default language. The default value is language.

MongoDB Aggregation

Aggregation operations process data records and return computed results. They group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result. In SQL, count(*) combined with group by is the equivalent of MongoDB aggregation.

The aggregate() Method

For aggregation in MongoDB, use the aggregate() method.

SYNTAX:

Basic syntax of aggregate() method is as follows
>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)

EXAMPLE:

In the collection you have the following data:
{
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by_user: 'only for programmers',
   url: 'http://www.only4programmers.blogspot.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d),
   title: 'NoSQL Overview', 
   description: 'No sql database is very fast',
   by_user: 'only for programmers',
   url: 'http://www.only4programmers.blogspot.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e),
   title: 'Neo4j Overview', 
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
},
Now, if you want to display a list of how many tutorials each user has written, use the aggregate() method on the above collection as shown below:
> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      {
         "_id" : "only for programmers",
         "num_tutorial" : 2
      },
      {
         "_id" : "Neo4j",
         "num_tutorial" : 1
      }
   ],
   "ok" : 1
}
>
The SQL equivalent of the above query is: select by_user, count(*) from mycol group by by_user
In the above example, documents are grouped by the field by_user, and on each occurrence of a by_user value the running sum for that group is incremented by 1. The following aggregation expressions are available:
Expression | Description | Example
$sum | Sums up the given value over all documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])
$avg | Calculates the average of the given values over all documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}])
$min | Gets the minimum of the corresponding values over all documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}])
$max | Gets the maximum of the corresponding values over all documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}])
$push | Inserts the value into an array in the resulting document. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}])
$addToSet | Inserts the value into an array in the resulting document but does not create duplicates. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}])
$first | Gets the value from the first document of each group. Typically this only makes sense together with a previously applied "$sort" stage. | db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}])
$last | Gets the value from the last document of each group. Typically this only makes sense together with a previously applied "$sort" stage. | db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}])
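
What these accumulators compute can be sketched in plain JavaScript, without a server. The snippet below is only an illustration of the semantics, not how MongoDB implements $group: groupDocs is a hypothetical helper, and the output field names (count, total_likes, and so on) are chosen for this example.

```javascript
// The sample documents from the example above (only the fields used here).
const docs = [
  { title: 'MongoDB Overview', by_user: 'only for programmers', url: 'http://www.only4programmers.blogspot.com', likes: 100 },
  { title: 'NoSQL Overview', by_user: 'only for programmers', url: 'http://www.only4programmers.blogspot.com', likes: 10 },
  { title: 'Neo4j Overview', by_user: 'Neo4j', url: 'http://www.neo4j.com', likes: 750 },
];

// Hypothetical helper: groups documents by `key` and mimics a few $group accumulators.
function groupDocs(docs, key) {
  const groups = new Map();
  for (const doc of docs) {
    const id = doc[key];
    if (!groups.has(id)) {
      groups.set(id, { _id: id, count: 0, total_likes: 0, urls: [], unique_urls: [] });
    }
    const g = groups.get(id);
    g.count += 1;                 // like {$sum : 1}
    g.total_likes += doc.likes;   // like {$sum : "$likes"}
    g.urls.push(doc.url);         // like {$push : "$url"}
    if (!g.unique_urls.includes(doc.url)) {
      g.unique_urls.push(doc.url); // like {$addToSet : "$url"}
    }
  }
  // like {$avg : "$likes"}, derived once per group at the end
  return [...groups.values()].map((g) => ({ ...g, avg_likes: g.total_likes / g.count }));
}

const result = groupDocs(docs, 'by_user');
console.log(JSON.stringify(result, null, 2));
```

This sketch returns groups in order of first appearance; a real $group stage does not guarantee any particular output order unless a $sort stage follows it.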

Pipeline Concept

In a UNIX command shell, a pipeline means executing an operation on some input and using its output as the input for the next command, and so on. MongoDB supports the same concept in its aggregation framework. There is a set of possible stages, and each stage takes a set of documents as input and produces a resulting set of documents (or the final resulting JSON document at the end of the pipeline). This result can in turn be used as input for the next stage, and so on.
The possible stages in the aggregation framework are the following:
  • $project: Used to select some specific fields from a collection.
  • $match: This is a filtering operation, so it can reduce the number of documents that are given as input to the next stage.
  • $group: This does the actual aggregation as discussed above.
  • $sort: Sorts the documents.
  • $skip: With this it is possible to skip forward in the list of documents by a given number of documents.
  • $limit: This limits the number of documents to look at to the given number, starting from the current position.
  • $unwind: This is used to unwind documents that use arrays. When using an array, the data is in a sense pre-joined, and this operation undoes that so that you have individual documents again. Thus this stage increases the number of documents for the next stage.
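
The idea that each stage consumes the output of the previous one can be sketched in plain JavaScript. This is a simplified model under stated assumptions, not the aggregation framework itself: the stage functions (match, project, sortBy, skip, limit, unwind) and runPipeline are hypothetical helpers, and the documents are cut-down versions of the sample data used earlier.

```javascript
// Each stage is modelled as a function from an array of documents to a new array.
const match = (pred) => (docs) => docs.filter(pred);                      // like $match
const project = (fields) => (docs) =>
  docs.map((d) => Object.fromEntries(fields.map((f) => [f, d[f]])));      // like $project
const sortBy = (field, dir) => (docs) =>
  [...docs].sort((a, b) => dir * (a[field] - b[field]));                  // like $sort (numeric fields only)
const skip = (n) => (docs) => docs.slice(n);                              // like $skip
const limit = (n) => (docs) => docs.slice(0, n);                          // like $limit
const unwind = (field) => (docs) =>
  docs.flatMap((d) => d[field].map((v) => ({ ...d, [field]: v })));       // like $unwind

// Run the stages left to right: the output of one stage is the input of the next.
const runPipeline = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

const docs = [
  { title: 'MongoDB Overview', tags: ['mongodb', 'database', 'NoSQL'], likes: 100 },
  { title: 'NoSQL Overview', tags: ['mongodb', 'database', 'NoSQL'], likes: 10 },
  { title: 'Neo4j Overview', tags: ['neo4j', 'database', 'NoSQL'], likes: 750 },
];

const out = runPipeline(docs, [
  match((d) => d.likes >= 100),  // 2 documents remain
  sortBy('likes', -1),           // most liked first
  project(['title', 'tags']),    // keep only title and tags
  unwind('tags'),                // one document per tag: 6 documents
  skip(1),                       // 5 documents
  limit(3),                      // 3 documents
]);
console.log(out);
```

Placing the filtering stage first keeps the later stages small, which is the same reason $match is usually put early in a real pipeline.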

MongoDB Replication

Replication is the process of synchronizing data across multiple servers. Replication provides redundancy and increases data availability: with multiple copies of data on different database servers, it protects a database from the loss of a single server. Replication also allows you to recover from hardware failures and service interruptions. With additional copies of the data, you can dedicate one to disaster recovery, reporting, or backup.

Why Replication?

  • To keep your data safe
  • High (24*7) availability of data
  • Disaster Recovery
  • No downtime for maintenance (like backups, index rebuilds, compaction)
  • Read scaling (extra copies to read from)
  • Replica set is transparent to the application

How replication works in MongoDB

MongoDB achieves replication by the use of replica sets. A replica set is a group of mongod instances that host the same data set. In a replica set, one node is the primary node, which receives all write operations. All other instances, the secondaries, apply operations from the primary so that they have the same data set. A replica set can have only one primary node.
  1. A replica set is a group of two or more nodes (generally a minimum of 3 nodes is required).
  2. In a replica set, one node is the primary node and the remaining nodes are secondaries.
  3. All data replicates from the primary to the secondary nodes.
  4. At the time of automatic failover or maintenance, an election is held and a new primary node is elected.
  5. After a failed node recovers, it rejoins the replica set and works as a secondary node.
The diagram below shows a typical MongoDB replication setup, in which the client application always interacts with the primary node and the primary node then replicates the data to the secondary nodes.
[Figure: MongoDB Replication]
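
The recommendation of at least three nodes follows from how a primary is elected: a candidate needs the votes of a majority of the voting members. The arithmetic can be sketched as follows (plain JavaScript; majority and faultTolerance are hypothetical helper names, and every member is assumed to have one vote):

```javascript
// Votes needed to elect a primary in a set of n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Number of members that can fail while the rest can still elect a primary.
function faultTolerance(n) {
  return n - majority(n);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log('members: ' + n + ', majority: ' + majority(n) + ', fault tolerance: ' + faultTolerance(n));
}
```

With two members the majority is two, so the loss of either node leaves the set without a primary; three members is the smallest set that survives the loss of one node.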

Replica set features

  • A cluster of N nodes
  • Any one node can be primary
  • All write operations go to the primary
  • Automatic failover
  • Automatic Recovery
  • Consensus election of primary

Set up a replica set

In this tutorial we will convert a standalone mongod instance to a replica set. To do so, follow the steps given below:
  • Shut down the already running MongoDB server.
  • Start the MongoDB server again, specifying the --replSet option. The basic syntax of --replSet is given below:
mongod --port "PORT" --dbpath "YOUR_DB_DATA_PATH" --replSet "REPLICA_SET_INSTANCE_NAME"

EXAMPLE

mongod --port 27017 --dbpath "D:\set up\mongodb\data" --replSet rs0
This starts a mongod instance on port 27017 as a member of a replica set named rs0. Now open a command prompt and connect to this mongod instance with the mongo client. In the mongo client, issue the command rs.initiate() to initiate a new replica set. To check the replica set configuration, issue the command rs.conf(). To check the status of the replica set, issue the command rs.status().

Add members to replica set

To add members to replica set, start mongod instances on multiple machines. Now start a mongo client and issue a command rs.add().

SYNTAX:

Basic syntax of rs.add() command is as follows:
>rs.add("HOST_NAME:PORT")

EXAMPLE

Suppose your mongod instance runs on the host mongod1.net on port 27017. To add this instance to the replica set, issue the command rs.add() in the mongo client.
>rs.add("mongod1.net:27017")
>
You can add a mongod instance to a replica set only when you are connected to the primary node. To check whether you are connected to the primary, issue the command db.isMaster() in the mongo client.

MongoDB Sharding

Sharding

Sharding is the process of storing data records across multiple machines, and it is MongoDB's approach to meeting the demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data or to provide acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding, you add more machines to support data growth and the demands of read and write operations.

Why Sharding?

  • In replication, all writes go to the primary (master) node
  • Latency-sensitive queries still go to the primary
  • A single replica set is limited to 12 nodes (in the MongoDB versions current at the time of writing)
  • Memory can't be large enough when the active dataset is big
  • Local disk is not big enough
  • Vertical scaling is too expensive

Sharding in MongoDB

The diagram below shows sharding in MongoDB using a sharded cluster.
[Figure: MongoDB Sharding]
The diagram contains three main components, which are described below:
  • Shards: Shards are used to store data. They provide high availability and data consistency. In a production environment, each shard is a separate replica set.
  • Config Servers: Config servers store the cluster's metadata. This data contains a mapping of the cluster's data set to the shards. The query router uses this metadata to target operations to specific shards. In a production environment, sharded clusters have exactly 3 config servers.
  • Query Routers: Query routers are mongos instances that interface with client applications and direct operations to the appropriate shard. The query router processes and targets operations to shards and then returns results to the clients. A sharded cluster can contain more than one query router to divide the client request load, and a client sends each request to one query router. Generally, a sharded cluster has many query routers.
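
How a query router might use the config servers' metadata to pick a shard can be sketched in plain JavaScript. This is a simplified model, assuming range-based partitioning on a numeric shard key; the chunk boundaries and shard names are made up, and targetShard and targetShards are hypothetical helpers, not mongos APIs.

```javascript
// Hypothetical chunk metadata of the kind a config server holds:
// each chunk covers the shard key range [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: 'shard0' },
  { min: 100, max: 200, shard: 'shard1' },
  { min: 200, max: Infinity, shard: 'shard2' },
];

// Route an operation on a single shard key value to the shard that owns it.
function targetShard(chunks, key) {
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  return chunk.shard;
}

// Route a range query [lo, hi) to every shard whose chunks overlap the range.
function targetShards(chunks, lo, hi) {
  const shards = chunks.filter((c) => lo < c.max && hi > c.min).map((c) => c.shard);
  return [...new Set(shards)];
}

console.log(targetShard(chunks, 42));
console.log(targetShards(chunks, 150, 250));
```

A lookup by a single key touches one shard, while a range query may have to be sent to several shards and the results merged by the router.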

MongoDB Create Backup

Dump MongoDB Data

To create a backup of a database in MongoDB, use the mongodump command. This command dumps all data of your server into the dump directory. Many options are available with which you can limit the amount of data or create a backup of a remote server.

SYNTAX:

Basic syntax of mongodump command is as follows
>mongodump

EXAMPLE

Start your mongod server. Assuming that your mongod server is running on localhost and port 27017, open a command prompt, go to the bin directory of your MongoDB installation, and type the command mongodump:
>mongodump
The command will connect to the server running at 127.0.0.1 on port 27017 and back up all data of the server to the directory /bin/dump/. Output of the command is shown below:
[Screenshot: DB Stats]
The following options can be used with the mongodump command:
Syntax | Description | Example
mongodump --host HOST_NAME --port PORT_NUMBER | This command will back up all databases of the specified mongod instance. | mongodump --host only4programmers.blogspot.com --port 27017
mongodump --dbpath DB_PATH --out BACKUP_DIRECTORY | This command will back up only the specified database at the specified path. | mongodump --dbpath /data/db/ --out /data/backup/
mongodump --collection COLLECTION --db DB_NAME | This command will back up only the specified collection of the specified database. | mongodump --collection mycol --db test

Restore data

To restore backup data, MongoDB's mongorestore command is used. This command restores all of the data from the backup directory.

SYNTAX

Basic syntax of mongorestore command is
>mongorestore
Output of the command is shown below:
[Screenshot: DB Stats]

MongoDB Deployment

When you are preparing a MongoDB deployment, you should try to understand how your application is going to hold up in production. It’s a good idea to develop a consistent, repeatable approach to managing your deployment environment so that you can minimize any surprises once you’re in production.
The best approach incorporates prototyping your setup, conducting load testing, monitoring key metrics, and using that information to scale your setup. The key part of the approach is to proactively monitor your entire system: this will help you understand how your production system will hold up before deploying, and determine where you will need to add capacity. Having insight into potential spikes in your memory usage, for example, could help put out a write-lock fire before it starts.
To monitor your deployment, MongoDB provides some commands, which are shown below:

mongostat

This command checks the status of all running mongod instances and returns counters of database operations. These counters include inserts, queries, updates, deletes, and cursors. The command also shows when you are hitting page faults and shows your lock percentage. High values here mean that you are running low on memory, hitting write capacity, or have some performance issue.
To run the command start your mongod instance. In another command prompt go to bin directory of your mongodb installation and type mongostat.
D:\set up\mongodb\bin>mongostat
Output of the command is shown below:
[Screenshot: mongostat]

mongotop

This command tracks and reports the read and write activity of a MongoDB instance on a per-collection basis. By default mongotop returns information every second, but you can change this interval. You should check that this read and write activity matches your application's intention, and that you are not firing too many writes to the database at a time, reading too frequently from disk, or exceeding your working set size.
To run the command start your mongod instance. In another command prompt go to bin directory of your mongodb installation and type mongotop.
D:\set up\mongodb\bin>mongotop
Output of the command is shown below:
[Screenshot: mongotop]
To make mongotop return information less frequently, specify the interval in seconds after the mongotop command.
D:\set up\mongodb\bin>mongotop 30
The above example will return values every 30 seconds.
Apart from the MongoDB tools, 10gen provides a free, hosted monitoring service, MongoDB Management Service (MMS), which provides a dashboard and gives you a view of the metrics from your entire cluster.

MongoDB Java

Installation

Before we start using MongoDB in our Java programs, we need to make sure that we have the MongoDB Java driver and Java set up on the machine. You can check a Java tutorial for Java installation on your machine. Now, let us check how to set up the MongoDB Java driver (the example class below is named MongoDBJDBC, although the driver is not a JDBC driver).
  • You need to download the driver jar, mongo.jar. Make sure to download the latest release of it.
  • You need to include mongo.jar in your classpath.

Connect to database

To connect to a database, you need to specify the database name. If the database doesn't exist, MongoDB creates it automatically.
A code snippet to connect to a database is as follows:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         // authenticate() takes the password as a char[]
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
Now, let's compile and run the above program to connect to our database test. You can change the path as per your requirement. We are assuming that the driver jar mongo-2.10.1.jar is available in the current directory, and it has to be on the classpath for compiling as well as for running:
$javac -classpath "mongo-2.10.1.jar" MongoDBJDBC.java
$java -classpath ".:mongo-2.10.1.jar" MongoDBJDBC
Connect to database successfully
Authentication: true
If you are using a Windows machine, you can compile and run your code as follows:
$javac -classpath "mongo-2.10.1.jar" MongoDBJDBC.java
$java -classpath ".;mongo-2.10.1.jar" MongoDBJDBC
Connect to database successfully
Authentication: true
The value of auth will be true if the user name and password are valid for the selected database.

Create a collection

To create a collection, the createCollection() method of the com.mongodb.DB class is used.
A code snippet to create a collection:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
         // createCollection() also takes an options document; an empty one uses the defaults
         DBCollection coll = db.createCollection("mycol", new BasicDBObject());
         System.out.println("Collection created successfully");
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
When program is compiled and executed, it will produce the following result:
Connect to database successfully
Authentication: true
Collection created successfully

Getting/ selecting a collection

To get/select a collection from the database, the getCollection() method of the com.mongodb.DB class is used.
A code snippet to get/select a collection:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
         // Select the collection created in the previous example
         DBCollection coll = db.getCollection("mycol");
         System.out.println("Collection mycol selected successfully");
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
When the program is compiled and executed, it will produce the following result:
Connect to database successfully
Authentication: true
Collection mycol selected successfully

Insert a document

To insert a document into MongoDB, the insert() method of the com.mongodb.DBCollection class is used.
A code snippet to insert a document:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
         DBCollection coll = db.getCollection("mycol");
         System.out.println("Collection mycol selected successfully");
         // Build the document field by field
         BasicDBObject doc = new BasicDBObject("title", "MongoDB").
            append("description", "database").
            append("likes", 100).
            append("url", "http://www.only4programmers.blogspot.com/mongodb/").
            append("by", "only for programmers");
         coll.insert(doc);
         System.out.println("Document inserted successfully");
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
When program is compiled and executed, it will produce the following result:
Connect to database successfully
Authentication: true
Collection mycol selected successfully
Document inserted successfully

Retrieve all documents

To select all documents from the collection, find() method of com.mongodb.DBCollection class is used. This method returns a cursor, so you need to iterate this cursor.
Code snippets to select all documents:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
         DBCollection coll = db.getCollection("mycol");
         System.out.println("Collection mycol selected successfully");
         // find() without arguments returns a cursor over all documents
         DBCursor cursor = coll.find();
         int i=1;
         while (cursor.hasNext()) {
            System.out.println("Inserted Document: "+i);
            System.out.println(cursor.next());
            i++;
         }
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
When program is compiled and executed, it will produce the following result:
Connect to database successfully
Authentication: true
Collection mycol selected successfully
Inserted Document: 1
{
   "_id" : ObjectId(7df78ad8902c),
   "title": "MongoDB",
   "description": "database",
   "likes": 100,
   "url": "http://www.only4programmers.blogspot.com/mongodb/",
   "by": "only for programmers"
}

Update document

To update a document in the collection, the update() method of the com.mongodb.DBCollection class is used.
A code snippet to update the documents:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
         DBCollection coll = db.getCollection("mycol");
         System.out.println("Collection mycol selected successfully");
         DBCursor cursor = coll.find();
         while (cursor.hasNext()) {
            DBObject updateDocument = cursor.next();
            // update() needs a query that selects the document, plus the new document
            DBObject query = new BasicDBObject("_id", updateDocument.get("_id"));
            updateDocument.put("likes", 200);
            coll.update(query, updateDocument);
         }
         System.out.println("Document updated successfully");
         cursor = coll.find();
         int i=1;
         while (cursor.hasNext()) {
            System.out.println("Updated Document: "+i);
            System.out.println(cursor.next());
            i++;
         }
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
When the program is compiled and executed, it will produce the following result:
Connect to database successfully
Authentication: true
Collection mycol selected successfully
Document updated successfully
Updated Document: 1
{
   "_id" : ObjectId(7df78ad8902c),
   "title": "MongoDB",
   "description": "database",
   "likes": 200,
   "url": "http://www.only4programmers.blogspot.com/mongodb/",
   "by": "only for programmers"
}

Delete first document

To delete the first document from the collection, you first need to select the document using the findOne() method and then pass it to the remove() method of the com.mongodb.DBCollection class.
A code snippet to delete the first document:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC{
   public static void main( String args[] ){
      // Placeholder credentials: replace with a user defined on the "test" database
      String myUserName = "myUserName";
      char[] myPassword = "myPassword".toCharArray();
      try{
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient( "localhost" , 27017 );
         // Now connect to your databases
         DB db = mongoClient.getDB( "test" );
         System.out.println("Connect to database successfully");
         boolean auth = db.authenticate(myUserName, myPassword);
         System.out.println("Authentication: "+auth);
         DBCollection coll = db.getCollection("mycol");
         System.out.println("Collection mycol selected successfully");
         // Select the first document and remove it
         DBObject myDoc = coll.findOne();
         coll.remove(myDoc);
         System.out.println("Document deleted successfully");
         // List whatever is left in the collection (nothing if it held one document)
         DBCursor cursor = coll.find();
         int i=1;
         while (cursor.hasNext()) {
            System.out.println("Remaining Document: "+i);
            System.out.println(cursor.next());
            i++;
         }
      }catch(Exception e){
         System.err.println( e.getClass().getName() + ": " + e.getMessage() );
      }
   }
}
When program is compiled and executed, it will produce the following result:
Connect to database successfully
Authentication: true
Collection mycol selected successfully
Document deleted successfully
The remaining MongoDB methods save(), limit(), skip(), sort() etc. work the same way as explained in the earlier sections of this tutorial.

MongoDB PHP

To use MongoDB with PHP you need the MongoDB PHP driver. Download the PHP driver, making sure to get the latest release of it. Now unzip the archive, put php_mongo.dll in your PHP extension directory ("ext" by default), and add the following line to your php.ini file:
extension=php_mongo.dll

Make a connection and Select a database

To make a connection, you need to specify the database name. If the database doesn't exist, MongoDB creates it automatically.
A code snippet to connect to a database is as follows:
<?php
   // connect to mongodb
   $m = new MongoClient();
   echo "Connection to database successfully\n";
   // select a database
   $db = $m->mydb;
   echo "Database mydb selected\n";
?>
When the program is executed, it will produce the following result:
Connection to database successfully
Database mydb selected

Create a collection

A code snippet to create a collection is as follows:
<?php
   // connect to mongodb
   $m = new MongoClient();
   echo "Connection to database successfully\n";
   // select a database
   $db = $m->mydb;
   echo "Database mydb selected\n";
   $collection = $db->createCollection("mycol");
   echo "Collection created successfully\n";
?>
When the program is executed, it will produce the following result:
Connection to database successfully
Database mydb selected
Collection created successfully

Insert a document

To insert a document into MongoDB, the insert() method is used.
A code snippet to insert a document:
<?php
   // connect to mongodb
   $m = new MongoClient();
   echo "Connection to database successfully\n";
   // select a database
   $db = $m->mydb;
   echo "Database mydb selected\n";
   $collection = $db->mycol;
   echo "Collection selected successfully\n";
   $document = array( 
      "title" => "MongoDB", 
      "description" => "database", 
      "likes" => 100,
      "url" => "http://www.only4programmers.blogspot.com/mongodb/",
      "by" => "only for programmers"
   );
   $collection->insert($document);
   echo "Document inserted successfully\n";
?>
When the program is executed, it will produce the following result:
Connection to database successfully
Database mydb selected
Collection selected successfully
Document inserted successfully

Find all documents

To select all documents from the collection, the find() method is used.
A code snippet to select all documents:
<?php
   // connect to mongodb
   $m = new MongoClient();
   echo "Connection to database successfully\n";
   // select a database
   $db = $m->mydb;
   echo "Database mydb selected\n";
   $collection = $db->mycol;
   echo "Collection selected successfully\n";

   $cursor = $collection->find();
   // iterate cursor to display title of documents
   foreach ($cursor as $document) {
      echo $document["title"] . "\n";
   }
?>
When the program is executed, it will produce the following result:
Connection to database successfully
Database mydb selected
Collection selected successfully
MongoDB

Update a document

To update a document, you need to use the update() method.
In the example given below we will update the title of the inserted document to MongoDB Tutorial. A code snippet to update a document:
<?php
   // connect to mongodb
   $m = new MongoClient();
   echo "Connection to database successfully\n";
   // select a database
   $db = $m->mydb;
   echo "Database mydb selected\n";
   $collection = $db->mycol;
   echo "Collection selected successfully\n";

   // now update the document
   $collection->update(array("title"=>"MongoDB"), array('$set'=>array("title"=>"MongoDB Tutorial")));
   echo "Document updated successfully\n";
   // now display the updated document
   $cursor = $collection->find();
   // iterate cursor to display title of documents
   echo "Updated document\n";
   foreach ($cursor as $document) {
      echo $document["title"] . "\n";
   }
?>
When the program is executed, it will produce the following result:
Connection to database successfully
Database mydb selected
Collection selected successfully
Document updated successfully
Updated document
MongoDB Tutorial
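
The update above uses the $set operator, whereas the Java update example earlier passes a whole document. The difference between the two styles can be sketched in plain JavaScript; applyUpdate is a hypothetical helper written for illustration (not a driver API), and the sample document is a cut-down version of the one inserted above.

```javascript
// Hypothetical helper mimicking two kinds of update on a single document.
function applyUpdate(doc, update) {
  if (Object.prototype.hasOwnProperty.call(update, '$set')) {
    // $set: only the listed fields change, all other fields are kept
    return { ...doc, ...update.$set };
  }
  // No operator: the whole document is replaced, only _id is preserved
  return { _id: doc._id, ...update };
}

const original = { _id: 1, title: 'MongoDB', description: 'database', likes: 100 };

const withSet = applyUpdate(original, { $set: { title: 'MongoDB Tutorial' } });
const replaced = applyUpdate(original, { title: 'MongoDB Tutorial' });

console.log(withSet);
console.log(replaced);
```

With $set the description and likes fields survive the update; with a plain replacement document they are dropped, which is why $set is usually the safer choice when only one field should change.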

Delete a document

To delete a document, you need to use the remove() method.
In the example given below we will remove the documents that have the title MongoDB Tutorial. A code snippet to delete documents:
<?php
   // connect to mongodb
   $m = new MongoClient();
   echo "Connection to database successfully\n";
   // select a database
   $db = $m->mydb;
   echo "Database mydb selected\n";
   $collection = $db->mycol;
   echo "Collection selected successfully\n";
   
   // now remove the documents; justOne => false removes every match
   $collection->remove(array("title"=>"MongoDB Tutorial"), array("justOne" => false));
   echo "Documents deleted successfully\n";
   
   // now display the available documents
   $cursor = $collection->find();
   // iterate cursor to display title of documents
   echo "Remaining documents\n";
   foreach ($cursor as $document) {
      echo $document["title"] . "\n";
   }
?>
When the program is executed, it will produce the following result:
Connection to database successfully
Database mydb selected
Collection selected successfully
Documents deleted successfully
Remaining documents
In the above example, the second parameter of remove() is an options array. Its justOne option is a boolean that controls whether only one matching document (true) or all matching documents (false) are removed.
The remaining MongoDB methods findOne(), save(), limit(), skip(), sort() etc. work the same way as explained in the tutorial above.


Sunday, 8 December 2013

IPv6

IPv6 Tutorial

Internet Protocol version 6 (IPv6) is the latest revision of the Internet Protocol (IP) and the successor to IPv4, the first version of the protocol to be widely deployed. IPv6 was developed by the Internet Engineering Task Force (IETF) to deal with the long-anticipated problem of IPv4 address exhaustion.
This tutorial will help you understand IPv6 and its associated terminology, along with appropriate references and examples.

IPv6 - Overview

Internet Protocol version 6 is a new addressing protocol designed to meet the requirements of the future Internet, referred to here as Internet version 2. Like its predecessor IPv4, it works at the Network Layer (Layer 3). Along with an enormous logical address space, the protocol offers ample features that address the shortcomings of IPv4.

Why new IP version?

So far, IPv4 has proven itself as a robust, routable addressing protocol and has served for decades on its best-effort delivery mechanism. It was designed in the early 1980s and has not changed in any major way since. At the time of its birth, the Internet was limited to a few universities, for their research, and to the Department of Defense. An IPv4 address is 32 bits long, which offers 4,294,967,296 (2^32) addresses. This address space was considered more than enough at the time. The major points that played a key role in the birth of IPv6 are given below:
  • The Internet has grown exponentially and the address space allowed by IPv4 is saturating. A protocol is required that can satisfy the need for future Internet addresses, which are expected to grow in unpredictable ways.
  • The use of features such as NAT has made the Internet discontiguous: one part, the intranet, primarily uses private IP addresses and has to go through a number of mechanisms to reach the other part, the Internet, which uses public IP addresses.
  • IPv4 on its own does not provide any security feature, which leaves it vulnerable because data on the Internet, a public domain, is never safe. Data has to be encrypted by some other security application before being sent over the Internet.
  • Data prioritization in IPv4 is not up to date. Though IPv4 has a few bits reserved for Type of Service or Quality of Service, they do not provide much functionality.
  • IPv4 enabled clients must be configured manually or need some address configuration mechanism. There is no technique that can configure a device with a globally unique IP address.

Why not IPv5?

To date, the only Internet Protocol in production use has been IPv4. Versions 0 to 3 were used while the protocol itself was under development and experimentation, so a good deal of background activity takes place before a protocol is put into production. Similarly, protocol version 5 was used while experimenting with a stream protocol for the Internet. It is known as the Internet Stream Protocol, and it used Internet Protocol version number 5 to encapsulate its datagrams. Though it was never brought into public use, the number had already been taken.
Here is a table of IP versions and their use:
[Image: IPv6 Version Table]

Brief History

After IPv4’s development in the early 1980s, the available IPv4 address pool began to shrink rapidly as the demand for addresses increased exponentially with the growth of the Internet. Anticipating the situation that might arise, the IETF initiated the development of an addressing protocol to replace IPv4 in 1994. The progress of IPv6 can be tracked through the RFCs published:
  • 1998 – RFC 2460 – Basic Protocol
  • 1999 – RFC 2553 – Basic Socket API
  • 2003 – RFC 3315 – DHCPv6
  • 2004 – RFC 3775 – Mobile IPv6
  • 2004 – RFC 3697 – Flow Label Specification
  • 2006 – RFC 4291 – Address architecture (revision)
  • 2006 – RFC 4294 – Node requirement
On June 06, 2012, some of the Internet giants chose to put their servers on IPv6. At present they use the Dual Stack mechanism to run IPv6 in parallel with IPv4.

IPv6 - Features

The successor of IPv4 is not designed to be backward compatible. While keeping the basic functionality of IP addressing, IPv6 has been redesigned entirely. It offers the following features:
  • Larger Address Space:
    In contrast to IPv4, IPv6 uses 4 times as many bits to address a device on the Internet. These extra bits provide approximately 3.4×10^38 different address combinations. Such a space can accommodate the aggressive requirements of address allotment for almost everything in this world. According to one estimate, 1564 addresses can be allocated to every square meter of the earth.
  • Simplified Header:
    IPv6’s header has been simplified by moving all non-essential information and options (which are present in the IPv4 header) out of the fixed header and into Extension Headers that follow it. The IPv6 header is only twice as large as the IPv4 header, even though an IPv6 address is four times longer.
  • End-to-end Connectivity:
    Every system now has a unique IP address and can traverse the Internet without NAT or other translating components. Once IPv6 is fully implemented, every host can directly reach any other host on the Internet, subject to limitations such as firewalls and organizational policies.
  • Auto-configuration:
    IPv6 supports both stateful and stateless auto-configuration of its host devices. This way, the absence of a DHCP server does not put a halt to inter-segment communication.
  • Faster Forwarding/Routing:
    The simplified header moves all non-essential information to Extension Headers after the fixed header. The information in the first part of the header is sufficient for a router to make a routing decision, so the decision can be made as quickly as the mandatory header can be read.
  • IPSec:
    Initially it was decided that IPv6 must include IPSec security, making it more secure than IPv4. This feature has now been made optional.
  • No Broadcast:
    Though Ethernet and Token Ring are considered broadcast networks because they support broadcasting, IPv6 itself no longer has any broadcast support. It uses multicast to communicate with multiple hosts.
  • Anycast Support:
    This is another characteristic of IPv6. IPv6 has introduced the Anycast mode of packet routing. In this mode, multiple interfaces across the Internet are assigned the same Anycast IP address. Routers, while routing, send the packet to the nearest destination.
  • Mobility:
    IPv6 was designed with mobility in mind. This feature enables hosts (such as mobile phones) to roam across different geographical areas and remain connected with the same IP address. The IPv6 mobility feature takes advantage of automatic IP configuration and Extension Headers.
  • Enhanced Priority support:
    IPv4 used a 6-bit DSCP (Differentiated Services Code Point) and a 2-bit ECN (Explicit Congestion Notification) to provide Quality of Service, but these could only be used if the devices end to end supported them, that is, the source device, the destination device and the underlying network all had to support them.
    In IPv6, the Traffic Class and Flow Label fields tell the underlying routers how to process and route the packet efficiently.
  • Smooth Transition:
    The large address scheme in IPv6 makes it possible to allocate globally unique IP addresses to devices. This means that address-saving mechanisms such as NAT are not required, so devices can send and receive data directly between each other; VoIP and streaming media, for example, can be used much more efficiently.
    The other fact is that the header is less loaded, so routers can make forwarding decisions and forward packets as quickly as they arrive.
  • Extensibility:
    One of the major advantages of the IPv6 header is that it can be extended with more information in the options part. IPv4 provides only 40 bytes for options, whereas options in IPv6 can be as large as the IPv6 packet itself.
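The address-space figures quoted in this tutorial follow directly from the address widths. The short Python sketch below simply recomputes them; the variable names are illustrative only.

```python
# Address-space arithmetic for IPv4 (32-bit) and IPv6 (128-bit) addresses.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS
ipv6_total = 2 ** IPV6_BITS

print(f"IPv4 addresses: {ipv4_total:,}")      # 4,294,967,296
print(f"IPv6 addresses: {ipv6_total:.1e}")    # 3.4e+38
print(f"IPv6 uses {IPV6_BITS // IPV4_BITS} times as many bits as IPv4")
```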

IPv6 - Addressing Modes

In computer networking, addressing mode refers to the mechanism by which we address a host on the network. IPv6 offers several modes: a single host can be addressed, more than one host can be addressed at once, or the host at the closest distance can be addressed.

Unicast

In the unicast mode of addressing, an IPv6 interface (host) is uniquely identified in a network segment. The IPv6 packet contains both source and destination IP addresses. A host interface is equipped with an IP address which is unique in that network segment. When a network switch or router receives a unicast IP packet destined for a single host, it sends the packet out of the one outgoing interface that connects to that particular host.
[Image: Unicast Messaging]

Multicast

The IPv6 multicast mode is the same as that of IPv4. A packet destined for multiple hosts is sent to a special multicast address. All hosts interested in that multicast information need to join the multicast group first. All interfaces that have joined the group receive the multicast packet and process it, while hosts that are not interested ignore it.
[Image: Multicast Messaging]

Anycast

IPv6 has introduced a new type of addressing, called Anycast addressing. In this addressing mode, multiple interfaces (hosts) are assigned the same Anycast IP address. When a host wishes to communicate with a host equipped with an Anycast IP address, it sends a Unicast message. With the help of a complex routing mechanism, that Unicast message is delivered to the host closest to the sender in terms of routing cost.
[Image: Anycast Messaging]
Let’s take the example of the only4programmers.blogspot.com Web Servers, located on all continents. Assume that all Web Servers are assigned a single IPv6 Anycast IP address. When a user from Europe wants to reach only4programmers.blogspot.com, the DNS returns that Anycast address and the network routes the request to the server physically located in Europe. If a user from India tries to reach only4programmers.blogspot.com, the same address leads to the Web Server physically located in Asia. The terms nearest and closest are used in the sense of routing cost.
In the above picture, when a client computer tries to reach a Server, the request is forwarded to the Server with the lowest routing cost.

IPv6 - Address Types & Formats

Hexadecimal Number System

Before introducing the IPv6 address format, we shall look at the Hexadecimal Number System. Hexadecimal is a positional number system which uses a radix (base) of 16. To represent values in a readable format, this system uses the symbols 0-9 for the values zero to nine and the symbols A-F for the values ten to fifteen. Every hexadecimal digit can therefore represent a value from 0 to 15.
[Image: Conversion Table]
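As a quick illustration of the conversions in the table, the following Python sketch moves between decimal, hexadecimal and binary using only built-in functions. The sample values are arbitrary.

```python
# Convert between decimal, hexadecimal and binary with Python built-ins.
value = 0xFE80                            # hexadecimal literal
as_binary = format(value, "016b")         # 16-bit binary string
as_hex = format(255, "X")                 # decimal 255 in hexadecimal
from_hex = int("DFE1", 16)                # hexadecimal text to integer

print(value)       # 65152
print(as_binary)   # 1111111010000000
print(as_hex)      # FF
print(from_hex)    # 57313
```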

Address Structure

An IPv6 address is made of 128 bits divided into eight 16-bit blocks. Each block is then converted into a 4-digit hexadecimal number, and the blocks are separated by colon symbols.
For example, below is a 128-bit IPv6 address represented in binary format and divided into eight 16-bit blocks:
0010000000000001 0000000000000000 0011001000111000 1101111111100001 0000000001100011 0000000000000000 0000000000000000 1111111011111011
Each block is then converted into hexadecimal and separated by the ‘:’ symbol:
2001:0000:3238:DFE1:0063:0000:0000:FEFB
Even after conversion to hexadecimal format, an IPv6 address remains long. IPv6 provides some rules to shorten the address. These rules are:
Rule 1: Discard leading zero(es):
In block 5, 0063, the two leading 0s can be omitted:
2001:0000:3238:DFE1:63:0000:0000:FEFB
Rule 2: If two or more consecutive blocks contain only zeroes, omit them all and replace them with a double colon (::), as with the 6th and 7th blocks:
2001:0000:3238:DFE1:63::FEFB
Consecutive blocks of zeroes can be replaced by :: only once, so any remaining block of zeroes in the address can be shrunk to a single zero, as with the 2nd block:
2001:0:3238:DFE1:63::FEFB
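The conversion and shortening rules can be checked with Python's standard ipaddress module. This is a minimal sketch; the binary blocks are those of the address 2001:0000:3238:DFE1:0063:0000:0000:FEFB, and the module prints hexadecimal digits in lower case.

```python
import ipaddress

# The eight 16-bit blocks of 2001:0000:3238:DFE1:0063:0000:0000:FEFB in binary.
blocks = ("0010000000000001 0000000000000000 0011001000111000 "
          "1101111111100001 0000000001100011 0000000000000000 "
          "0000000000000000 1111111011111011").split()

# Convert each block to four hexadecimal digits and join them with colons.
full = ":".join(f"{int(block, 2):04X}" for block in blocks)
print(full)             # 2001:0000:3238:DFE1:0063:0000:0000:FEFB

addr = ipaddress.IPv6Address(full)
print(addr.compressed)  # 2001:0:3238:dfe1:63::fefb
print(addr.exploded)    # 2001:0000:3238:dfe1:0063:0000:0000:fefb
```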

Interface ID

IPv6 has three different types of Unicast Address scheme. The second half of the address (the last 64 bits) is always used for the Interface ID. The MAC address of a system is composed of 48 bits and is represented in hexadecimal. MAC addresses are considered to be uniquely assigned worldwide, and the Interface ID takes advantage of this uniqueness. A host can auto-configure its Interface ID by using IEEE’s Extended Unique Identifier (EUI-64) format. First, the host divides its own MAC address into two 24-bit halves. Then the 16-bit hex value 0xFFFE is sandwiched between those two halves, resulting in a 64-bit Interface ID. In the modified EUI-64 format used by IPv6, the universal/local bit (the seventh most significant bit of the first octet) is also inverted.
[Image: EUI-64 Interface ID]
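The EUI-64 construction can be sketched in a few lines of Python. The function name and the sample MAC address are hypothetical; the sketch also inverts the universal/local bit, as the modified EUI-64 format in RFC 4291 specifies.

```python
def eui64_interface_id(mac: str) -> str:
    """Build a modified EUI-64 interface ID from a 48-bit MAC address."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address must be exactly 48 bits long")
    # Insert 0xFFFE between the two 24-bit halves of the MAC address.
    eui = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    # Invert the universal/local bit of the first octet (RFC 4291, Appendix A).
    eui[0] ^= 0x02
    # Format the 64 bits as four 16-bit hexadecimal blocks.
    return ":".join(eui[i:i + 2].hex() for i in range(0, 8, 2))


print(eui64_interface_id("00:1A:2B:3C:4D:5E"))  # 021a:2bff:fe3c:4d5e
```

Appending this Interface ID to a 64-bit prefix such as fe80:: gives the full 128-bit address.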

Global Unicast Address

This address type is equivalent to IPv4’s public address. Global Unicast addresses in IPv6 are globally identifiable and uniquely addressable.
[Image: Global Unicast Address]
Global Routing Prefix: The most significant 48 bits are designated as the Global Routing Prefix, which is assigned to a specific Autonomous System. The three most significant bits of the Global Routing Prefix are always set to 001.

Link-Local Address

An auto-configured IPv6 address is known as a Link-Local address. This address always starts with FE80. The first 16 bits of a Link-Local address are always set to 1111 1110 1000 0000 (FE80). The next 48 bits are set to 0, thus:
[Image: Link-Local Address]
Link-Local addresses are used for communication among IPv6 hosts on a link (broadcast segment) only. These addresses are not routable, so a router never forwards packets carrying them outside the link.

Unique-Local Address

This type of IPv6 address is globally unique, but it should be used only for local communication. The second half of the address is the Interface ID, and the first half is divided among the Prefix, Local Bit, Global ID and Subnet ID.
[Image: Unique-Local Address]
The Prefix is always set to 1111 110. The L bit is set to 1 if the address is locally assigned. So far, the meaning of an L bit set to 0 is not defined. Therefore, a Unique Local IPv6 address always starts with ‘FD’.

SCOPE OF IPV6 UNICAST ADDRESSES:

[Image: IPv6 Unicast Address Scope]
The scope of a Link-Local address is limited to the segment. Unique Local Addresses are globally unique but are not routed over the Internet, which limits their scope to an organization’s boundary. Global Unicast addresses are globally unique and recognizable. They form the core of Internet v2 addressing.
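The three unicast scopes can be told apart by their leading bits. The sketch below is one possible classifier built on Python's ipaddress module; the function name is hypothetical, the prefixes fe80::/10, fc00::/7 and 2000::/3 are the standard blocks corresponding to the bit patterns described above, and the sample addresses are made up for illustration.

```python
import ipaddress

# Standard prefixes for the three unicast address types.
LINK_LOCAL = ipaddress.ip_network("fe80::/10")
UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")      # 1111 110 + L bit
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")    # leading bits 001


def unicast_scope(text: str) -> str:
    """Return the unicast scope that an IPv6 address falls into."""
    addr = ipaddress.IPv6Address(text)
    if addr in LINK_LOCAL:
        return "link-local"
    if addr in UNIQUE_LOCAL:
        return "unique-local"
    if addr in GLOBAL_UNICAST:
        return "global unicast"
    return "other"


for sample in ("fe80::1", "fd12:3456:789a:1::1", "2001:db8::1", "::1"):
    print(sample, "->", unicast_scope(sample))
```

Note that 2001:db8::/32 is reserved for documentation, so the third sample only illustrates the 001 bit pattern.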

IPv6 - Special Addresses

Version 6 has a slightly more complex IP address structure than IPv4. IPv6 has reserved a few addresses and address notations for special purposes. See the table below:

Special Addresses:

  • As shown in the table above, the address 0:0:0:0:0:0:0:0/128 does not refer to anything and is called the unspecified address. After simplification, all the 0s are compacted to ::/128.
  • In IPv4, the address 0.0.0.0 with netmask 0.0.0.0 represents the default route. The same concept applies to IPv6: the address 0:0:0:0:0:0:0:0 with a netmask of all 0s represents the default route. After applying the IPv6 simplification rules, this address is compressed to ::/0.
  • Loopback addresses in IPv4 are represented by the range 127.0.0.1 to 127.255.255.255. In IPv6, only the address 0:0:0:0:0:0:0:1/128 represents the loopback address. After simplification, it can be written as ::1/128.
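The three special addresses just described can be inspected with Python's ipaddress module, which also applies the compression rules. A minimal sketch:

```python
import ipaddress

unspecified = ipaddress.IPv6Address("0:0:0:0:0:0:0:0")
loopback = ipaddress.IPv6Address("0:0:0:0:0:0:0:1")
default_route = ipaddress.IPv6Network("0:0:0:0:0:0:0:0/0")

print(unspecified.compressed, unspecified.is_unspecified)   # :: True
print(loopback.compressed, loopback.is_loopback)            # ::1 True
# The default route covers the entire IPv6 address space.
print(default_route.compressed, default_route.num_addresses == 2 ** 128)
```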

Reserved Multicast Address for Routing Protocols:

  • The above table shows the reserved multicast addresses used by interior routing protocols.
  • All addresses are reserved in a fashion similar to IPv4.

Reserved Multicast Address for Routers/Node:

  • These addresses help routers and hosts speak to the available routers and hosts on a segment without being configured with an IPv6 address. Hosts use EUI-64 based auto-configuration to self-configure an IPv6 address and then speak to the available hosts/routers on the segment by means of these addresses.

IPv6 - Headers

The strength of IPv6 lies in its header. An IPv6 address is 4 times larger than an IPv4 address, but the IPv6 header is only 2 times larger than that of IPv4. An IPv6 packet has one Fixed Header and zero or more optional (Extension) Headers. All the information that is essential for a router is kept in the Fixed Header. Extension Headers contain optional information which helps routers understand how to handle a packet or flow.

Fixed Header

[Image: IPv6 Fixed Header]
The IPv6 fixed header is 40 bytes long and contains the following fields:
1. Version (4 bits): This represents the version of the Internet Protocol, i.e. 0110.
2. Traffic Class (8 bits): These 8 bits are divided into two parts. The most significant 6 bits are used for Type of Service, which tells the router what services should be provided to this packet. The least significant 2 bits are used for Explicit Congestion Notification (ECN).
3. Flow Label (20 bits): This label is used to maintain the sequential flow of the packets belonging to a communication. The source labels the sequence, which helps the router identify that a packet belongs to a specific flow of information. This field helps to avoid re-ordering of data packets. It is designed for streaming/real-time media.
4. Payload Length (16 bits): This field tells routers how much information the packet contains in its payload. The payload is composed of Extension Headers and Upper Layer data. With 16 bits, up to 65535 bytes can be indicated; if the Extension Headers include a Hop-by-Hop Extension Header, the payload may exceed 65535 bytes, in which case this field is set to 0.
5. Next Header (8 bits): This field indicates either the type of Extension Header or, if no Extension Header is present, the Upper Layer PDU. The values for the type of Upper Layer PDU are the same as in IPv4.
6. Hop Limit (8 bits): This field stops a packet from looping in the network indefinitely. It is the same as TTL in IPv4. The value of the Hop Limit field is decremented by 1 each time the packet passes a link (router/hop). When the field reaches 0, the packet is discarded.
7. Source Address (128 bits): This field indicates the address of the originator of the packet.
8. Destination Address (128 bits): This field provides the address of the intended recipient of the packet.
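The field layout above maps onto 40 bytes in a straightforward way. The following Python sketch packs and unpacks a fixed header with the standard struct module; the function names and sample values are hypothetical and the code is meant only to illustrate the bit layout, not to serve as a full packet parser.

```python
import ipaddress
import struct


def build_fixed_header(traffic_class, flow_label, payload_length,
                       next_header, hop_limit, source, destination):
    """Pack the eight fixed-header fields into 40 bytes (network byte order)."""
    # First 32 bits: version (4) | traffic class (8) | flow label (20).
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return (struct.pack("!IHBB", first_word, payload_length,
                        next_header, hop_limit)
            + ipaddress.IPv6Address(source).packed
            + ipaddress.IPv6Address(destination).packed)


def parse_fixed_header(data):
    """Unpack the first 40 bytes of an IPv6 packet into a dictionary."""
    if len(data) < 40:
        raise ValueError("the IPv6 fixed header is 40 bytes long")
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(data[8:24])),
        "destination": str(ipaddress.IPv6Address(data[24:40])),
    }


header = build_fixed_header(0, 0x12345, 20, 6, 64,
                            "2001:db8::1", "2001:db8::2")
print(len(header))                 # 40
print(parse_fixed_header(header))
```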

Extension Headers

In IPv6, the Fixed Header contains only the information that is necessary and leaves out information that is either not required or rarely used. All such information is placed between the Fixed Header and the Upper Layer header in the form of Extension Headers. Each Extension Header is identified by a distinct value.
When Extension Headers are used, the Next Header field of the IPv6 Fixed Header points to the first Extension Header. If there is one more Extension Header, the ‘Next-Header’ field of the first Extension Header points to the second one, and so on. The ‘Next-Header’ field of the last Extension Header points to the Upper Layer Header. Thus each header points to the next one in a linked-list manner.
If the Next Header field contains the value 59, it indicates that there is no header after this one, not even an Upper Layer Header.
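The following Extension Headers must be supported as per RFC 2460:
  • Hop-by-Hop Options header (Next Header value 0)
  • Routing header (43)
  • Fragment header (44)
  • Destination Options header (60)
  • Authentication header (51)
  • Encapsulating Security Payload header (50)
The sequence of Extension Headers should be:
  • IPv6 Fixed Header
  • Hop-by-Hop Options header
  • Destination Options header (note 1)
  • Routing header
  • Fragment header
  • Authentication header
  • Encapsulating Security Payload header
  • Destination Options header (note 2)
  • Upper Layer header
These headers:
  • 1. Should be processed by the first and subsequent destinations.
  • 2. Should be processed by the final destination.
Extension Headers are arranged one after another in a linked-list manner, as depicted in the diagram below: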
[Image: Extension Headers Connected Format]
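The linked-list behaviour of the Next Header field can be illustrated with a small walker. This Python sketch is a simplification under stated assumptions: it recognizes only four extension header types from RFC 2460, relies on the fact that each of them starts with a Next Header byte followed by a length byte counted in 8-byte units beyond the first 8 bytes (the Fragment header is always 8 bytes), and uses hypothetical names and a hand-built sample payload.

```python
EXTENSION_HEADERS = {
    0: "Hop-by-Hop Options",
    43: "Routing",
    44: "Fragment",
    60: "Destination Options",
}
NO_NEXT_HEADER = 59


def walk_header_chain(first_next_header: int, payload: bytes) -> list:
    """Follow Next Header values through the extension headers in payload."""
    chain = []
    next_header, offset = first_next_header, 0
    while next_header in EXTENSION_HEADERS:
        if offset + 8 > len(payload):
            raise ValueError("truncated extension header")
        chain.append(EXTENSION_HEADERS[next_header])
        current = next_header
        next_header = payload[offset]          # first byte: Next Header
        if current == 44:
            length = 8                         # Fragment header is fixed size
        else:
            length = (payload[offset + 1] + 1) * 8
        offset += length
    if next_header == NO_NEXT_HEADER:
        chain.append("No Next Header")
    else:
        chain.append(f"upper-layer protocol {next_header}")
    return chain


# Hop-by-Hop header pointing to Destination Options, which points to TCP (6).
# Bytes 2-7 of each header are a PadN option used as filler.
payload = bytes([60, 0, 1, 4, 0, 0, 0, 0]) + bytes([6, 0, 1, 4, 0, 0, 0, 0])
print(walk_header_chain(0, payload))
```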

IPv6 - Communication

In IPv4, a host that wants to communicate with another host on the network first needs an IP address, acquired either by means of DHCP or by manual configuration. As soon as a host is equipped with a valid IP address, it is able to speak to any host on the subnet. To communicate at layer 3, a host must also know the IP address of the other host. Communication on a link is established by means of hardware-embedded MAC addresses. To learn the MAC address of a host whose IP address is known, a host sends an ARP broadcast, and in reply the intended host sends back its MAC address.
In IPv6, there is no broadcast mechanism. An IPv6 enabled host is not required to obtain an IP address from DHCP or by manual configuration; it can auto-configure its own IP address. How, then, does a host communicate with others on an IPv6 enabled network?
ARP has been replaced by the ICMPv6 Neighbor Discovery Protocol.

Neighbor Discovery Protocol

A host in an IPv6 network is capable of auto-configuring itself with a unique link-local address. As soon as it is equipped with an IPv6 address, it joins a number of multicast groups. All communication related to that segment happens on those multicast addresses only. A host goes through a series of states in IPv6:
  • Neighbor Solicitation: After configuring all its IPv6 addresses, whether manually, by a DHCP server or by auto-configuration, the host sends a Neighbor Solicitation message for each of its IPv6 addresses to the corresponding solicited-node multicast address (FF02::1:FFxx:xxxx) in order to make sure that no one else occupies the same address.
  • DAD (Duplicate Address Detection): When the host does not hear anything from the segment in response to its Neighbor Solicitation message, it assumes that no duplicate address exists on the segment.
  • Neighbor Advertisement: After assigning the addresses to its interfaces and bringing them up, the host sends out a Neighbor Advertisement message telling all other hosts on the segment that it has assigned those IPv6 addresses to its interfaces.
Once a host is done with the configuration of its IPv6 addresses, it does the following things:
  • Router Solicitation: A host sends a Router Solicitation multicast packet (to the all-routers address FF02::2) out on its segment to learn of the presence of any router on the segment. This helps the host configure the router as its default gateway. If its default gateway router goes down, the host can shift to a new router and make it the default gateway.
  • Router Advertisement: When a router receives a Router Solicitation message, it responds to the host, advertising its presence on that link.
  • Redirect: This covers the situation where a router receives a packet from a host to forward but knows that it is not the best gateway for that destination. In this situation, the router sends back a Redirect message telling the host that a better ‘next-hop’ router is available. The next hop is where the host will send data destined for a host that does not belong to the same segment.
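Neighbor Discovery relies heavily on multicast groups. One group that every unicast address maps to is the solicited-node multicast address defined in RFC 4291: the prefix ff02::1:ff00:0/104 followed by the low-order 24 bits of the unicast address. The Python sketch below shows that mapping; the function name and the sample address are hypothetical.

```python
import ipaddress


def solicited_node_multicast(address: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a unicast address."""
    # Keep only the low-order 24 bits of the unicast address.
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    # Append them to the fixed 104-bit prefix ff02::1:ff00:0/104.
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


group = solicited_node_multicast("fe80::21a:2bff:fe3c:4d5e")
print(group)               # ff02::1:ff3c:4d5e
print(group.is_multicast)  # True
```

Because only the last 24 bits are used, a Neighbor Solicitation reaches a handful of hosts at most instead of every node on the segment.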

IPv6 - Subnetting

In IPv4, addresses were created in classes. Classful IPv4 addresses clearly define the bits used for the network prefix and the bits used for hosts on that network. To subnet in IPv4, we work with the default classful netmask, which allows us to borrow host bits to be used as subnet bits. This results in multiple subnets but fewer hosts per subnet. That is, every host bit borrowed to create a subnet leaves fewer bits for host addresses.
IPv6 addresses use 128 bits to represent an address, which includes bits to be used for subnetting. The second half of the address (the least significant 64 bits) is always used for hosts only. Therefore, nothing is compromised when we subnet the network.
[Image: IPv6 Subnetting]
16 bits of subnet is equivalent to IPv4’s Class B network. Using these subnet bits, an organization can have more than 65 thousand subnets, which is by far more than enough.
Thus the routing prefix is /64 and the host portion is 64 bits. We can subnet the network further, beyond the 16 bits of Subnet ID, by borrowing host bits, but it is recommended that 64 bits always be used for host addresses because auto-configuration requires 64 bits.
IPv6 subnetting works on the same concept as Variable Length Subnet Masking in IPv4.
A /48 prefix can be allocated to an organization, giving it the benefit of subnet prefixes up to /64, which is 65,536 sub-networks, each having 2^64 hosts. A /64 prefix can be assigned to a point-to-point connection where there are only two hosts (or IPv6 enabled devices) on a link.
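The subnet arithmetic for a /48 allocation can be reproduced with Python's ipaddress module. In this sketch the prefix 2001:db8:abcd::/48 is an assumption taken from the documentation range, not a real allocation.

```python
import ipaddress
from itertools import islice

# Hypothetical /48 allocation taken from the documentation prefix.
site = ipaddress.ip_network("2001:db8:abcd::/48")

subnet_count = 2 ** (64 - site.prefixlen)    # 16 subnet bits
hosts_per_subnet = 2 ** (128 - 64)           # 64 interface ID bits

print(subnet_count)        # 65536
print(hosts_per_subnet)    # 18446744073709551616

# List the first three /64 subnets without generating all 65,536 of them.
first_three = [str(net) for net in islice(site.subnets(new_prefix=64), 3)]
print(first_three)
```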

Transition From IPv4 to IPv6

One problem in making a complete transition from IPv4 to IPv6 is that IPv6 is not backward compatible. This results in a situation where a site is either on IPv6 or it is not. This is unlike the introduction of other new technologies, where the newer one is backward compatible so that the older system can still work with it without any additional changes.
To overcome this shortcoming, a few technologies exist which can be used for a slow and smooth transition from IPv4 to IPv6:

Dual Stack Routers

A router can be installed with both IPv4 and IPv6 addresses configured on its interfaces, each pointing to the network of the relevant IP scheme.
[Image: Dual Stack Router]
In the above diagram, a Server which has both an IPv4 and an IPv6 address configured can speak with all hosts on the IPv4 network and the IPv6 network with the help of a Dual Stack Router. The Dual Stack Router can communicate with both networks and provides a medium for hosts to access the Server without changing their respective IP version.

Tunneling

In a scenario where different IP versions exist on the intermediate path or transit network, tunneling provides a solution in which the user’s data can pass through a network running an IP version it does not otherwise support.
[Image: Tunneling]
The above diagram depicts how two remote IPv4 networks can communicate via a Tunnel when the transit network is on IPv6. The reverse is also possible, where the transit network is on IPv4 and the remote sites that intend to communicate are on IPv6.

NAT Protocol Translation

This is another important method of transition to IPv6, by means of a NAT-PT (Network Address Translation – Protocol Translation) enabled device. With the help of a NAT-PT device, actual conversion happens between IPv4 and IPv6 packets in both directions. See the diagram below:
[Image: NAT - Protocol Translation]
A host with an IPv4 address sends a request to an IPv6 enabled Server on the Internet which does not understand IPv4 addresses. In this scenario, a NAT-PT device can help them communicate. When the IPv4 host sends a request packet to the IPv6 Server, the NAT-PT device/router removes the IPv4 header, adds an IPv6 header and passes the packet on through the Internet. When a response from the IPv6 Server arrives for the IPv4 host, the router does the reverse.

IPv6 Mobility

When a host is connected to a link or network, it acquires an IP address, and all communication on that link happens using that IP address. As soon as the same host changes its physical location, that is, moves into a different area, subnet, network or link, its IP address changes accordingly and all communication that was using the old IP address goes down.
IPv6 mobility provides a mechanism which gives a host the ability to roam among different links without losing its connections or its IP address.
Multiple entities are involved in this technology:
  • Mobile Node: The device which needs IPv6 mobility.
  • Home Link: This link is configured with the home subnet prefix, and this is where the Mobile IPv6 device gets its Home Address.
  • Home Address: This is the address which the Mobile Node acquires from the Home Link. It is the permanent address of the Mobile Node. If the Mobile Node remains on its Home Link, communication among the various entities happens as usual.
  • Home Agent: This is a router which acts as a registrar for Mobile Nodes. The Home Agent is connected to the Home Link and maintains information about all Mobile Nodes, their Home Addresses and their present IP addresses.
  • Foreign Link: Any link which is not the Mobile Node’s Home Link.
  • Care-of Address: When a Mobile Node attaches to a Foreign Link, it acquires a new IP address from that Foreign Link’s subnet. The Home Agent maintains the information of both the Home Address and the Care-of Address. Multiple Care-of Addresses can be assigned to a Mobile Node, but at any instant only one Care-of Address is bound to the Home Address.
  • Correspondent Node: Any IPv6 enabled device which intends to communicate with the Mobile Node.

Mobility Operation

When the Mobile Node stays on its Home Link, all communication happens on its Home Address, as shown below:
[Image: Mobile Node connected to Home Link]
When the Mobile Node leaves its Home Link and connects to a Foreign Link, the mobility feature of IPv6 comes into play. After connecting to the Foreign Link, the Mobile Node acquires an IPv6 address from it. This address is called the Care-of Address. The Mobile Node sends a binding request to its Home Agent with the new Care-of Address. The Home Agent binds the Mobile Node’s Home Address to the Care-of Address, establishing a Tunnel between the two.
Whenever a Correspondent Node tries to establish a connection with the Mobile Node (on its Home Address), the Home Agent intercepts the packet and forwards it to the Mobile Node’s Care-of Address over the Tunnel which was already established.
[Image: Mobile Node connected to Foreign Link]

Route Optimization

When a Correspondent Node initiates communication by sending packets to the Mobile Node on its Home Address, these packets are tunneled to the Mobile Node by the Home Agent. In Route Optimization mode, when the Mobile Node receives a packet from the Correspondent Node, it does not send its replies through the Home Agent. Rather, it sends its packets directly to the Correspondent Node, using the Home Address as the Source Address. This mode is optional and not used by default.

IPv6 Routing

Routing concepts remain the same in IPv6, but almost all routing protocols have been redefined accordingly. We have seen in the Communication section how a host speaks to its gateway. Routing is the process of forwarding routable data by choosing the best route among several available routes or paths to the destination. A router is a device which forwards data that is not explicitly destined for it.
There exist two forms of routing protocols:
  • Distance Vector Routing Protocol: A router running a distance vector protocol advertises its connected routes and learns new routes from its neighbors. The routing cost to reach a destination is calculated in terms of hops between the source and the destination. A router generally relies on its neighbors for best path selection, which is also known as “routing-by-rumors”. RIP and BGP are Distance Vector Protocols.
  • Link-State Routing Protocol: This protocol tracks the state of each link and advertises it to its neighbors. Information about new links is learnt from peer routers. After all the routing information has converged, a Link-State Routing Protocol uses its own algorithm to calculate the best path to all available links. OSPF and IS-IS are link-state routing protocols, and both use Dijkstra’s Shortest Path First algorithm.
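To make the link-state calculation concrete, here is a minimal Python sketch of Dijkstra's Shortest Path First algorithm over a small, made-up topology. The router names and link costs are assumptions for illustration; real OSPF or IS-IS implementations build this graph from flooded link-state advertisements and do considerably more.

```python
import heapq


def shortest_paths(graph, source):
    """Return the lowest total cost from source to every reachable router."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue                      # stale queue entry, already improved
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return dist


# Hypothetical topology: router -> {neighbor: link cost}.
links = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 2, "R4": 1},
    "R3": {"R1": 1, "R2": 2, "R4": 5},
    "R4": {"R2": 1, "R3": 5},
}
print(shortest_paths(links, "R1"))
```

In this topology the direct R1 to R2 link costs 10, but the path through R3 costs only 3, so the algorithm prefers it.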
Routing protocols can be divided into two categories:
  • Interior Routing Protocol: Protocols in this category are used within an Autonomous System or organization to distribute routes among all routers inside its boundary. Examples: RIP, OSPF.
  • Exterior Routing Protocol: An Exterior Routing Protocol distributes routing information between two different Autonomous Systems or organizations. Example: BGP.

Routing protocols

  • RIPng
    RIPng stands for Routing Information Protocol Next Generation. This is an Interior Routing Protocol and a Distance Vector Protocol. RIPng has been upgraded to support IPv6.
  • OSPFv3
    Open Shortest Path First version 3 is an Interior Routing Protocol which has been modified to support IPv6. It is a Link-State Protocol and uses Dijkstra’s Shortest Path First algorithm to calculate the best path to all destinations.
  • BGPv4
    BGP stands for Border Gateway Protocol. It is the only open standard Exterior Gateway Protocol available. BGP is a Distance Vector protocol which uses Autonomous Systems as its calculation metric, instead of counting routers as hops. BGPv4 is an upgrade of BGP to support IPv6 routing.

Protocols changed to support IPv6:

  • ICMPv6: Internet Control Message Protocol version 6 is an upgraded implementation of ICMP that accommodates IPv6 requirements. It is used for diagnostic functions, error and informational messages, and statistical purposes. ICMPv6’s Neighbor Discovery Protocol replaces ARP and helps a host discover neighbors and routers on the link.
  • DHCPv6: Dynamic Host Configuration Protocol version 6 is the IPv6 implementation of DHCP. IPv6-enabled hosts do not require a DHCPv6 server to acquire an IP address, because they can auto-configure one. Nor do they need DHCPv6 to locate a DNS server, because DNS server addresses can be discovered and configured via the ICMPv6 Neighbor Discovery Protocol. A DHCPv6 server can nevertheless be used to provide this information.
  • DNS: There is no new version of DNS, but it has been extended to support queries for IPv6 addresses. A new AAAA (quad-A) record has been added to answer IPv6 queries. DNS can now reply with addresses of both IP versions (4 and 6) without any change in query format.
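
The point that the DNS query format itself did not change can be illustrated by building two raw queries by hand, one for an A record and one for an AAAA record. The sketch below uses only the Python standard library and sends nothing over the network. The helper name build_query and the query ID 0x1234 are arbitrary choices for illustration; the record type numbers (1 for A, 28 for AAAA) are the standard ones.

```python
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record
QCLASS_IN = 1    # Internet class


def build_query(name, qtype, query_id=0x1234):
    """Build a minimal DNS query message for `name` and record type `qtype`."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, then three zero counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


if __name__ == "__main__":
    query_a = build_query("example.com", QTYPE_A)
    query_aaaa = build_query("example.com", QTYPE_AAAA)
    print(query_a.hex())
    print(query_aaaa.hex())
```

The two messages are identical except for the two-byte record type field near the end, which is why existing resolvers could start asking for IPv6 addresses without a new DNS version.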

IPv6 Summary

IPv4 has been the undisputed leader of the Internet since the early 1980s. With IPv4’s address space exhausted, IPv6 is now gradually taking over the Internet, which this tutorial refers to as Internet version 2.
IPv4 is widely deployed, and migration to IPv6 will not be easy. So far, IPv6 has captured less than 1% of IPv4’s share.
The world celebrated ‘World IPv6 Day’ on June 08, 2011, with the purpose of testing IPv6 over the Internet at full scale. On June 06, 2012, the Internet community officially launched IPv6. On that day, all ISPs offering IPv6 were to enable it on the public domain and keep it enabled. Device manufacturers also participated by offering IPv6 enabled by default on their devices.
This was a step towards encouraging the Internet community to migrate to IPv6.
Organizations have plenty of ways to migrate from IPv4 to IPv6. An organization that wants to test IPv6 before migrating completely can run IPv4 and IPv6 simultaneously. Networks running different IP versions can communicate, and user data can be tunneled across to the other side.

Future of IPv6

The IPv6-enabled Internet version 2 will replace today’s IPv4-enabled Internet. When the Internet was launched with IPv4, developed regions such as the US and Europe took the larger share of the IPv4 address space for deploying the Internet in their own countries, keeping future needs in mind. But the Internet exploded everywhere, reaching and connecting every country in the world and increasing the demand for IPv4 address space. As a result, the US and Europe still hold a large amount of IPv4 address space, while countries like India and China have to meet their address requirements by deploying IPv6.
Most IPv6 deployment is taking place outside the US and Europe. India and China are moving forward to convert their entire address space to IPv6. China has announced a five-year deployment plan named China Next Generation Internet.
After June 06, 2012, major ISPs began shifting to IPv6, and the rest are still moving.
IPv6 provides ample address space and is designed to expand today’s Internet services. A feature-rich, IPv6-enabled Internet version 2 may deliver more than expected.