Search This Blog

Loading...

Tuesday, February 24, 2015

[Mongodb] M-102 Final Exam Answer

Final 1
-------
Answer: 9

Final 2
-------
Answer: 2
  Mongo preserves the order of writes in a collection in its consistency model. In this problem, 27003's oplog was effectively a "fork" and to preserve write ordering a rollback was necessary during 27003's recovery phase.

Final 3
-------
3 documents
>  bsondump test.foo.2013-03-11T10-25-48.0.bson
{ "_id" : 7 }
{ "_id" : 8 }
{ "_id" : 9 }
3 objects found

Final 4
--------
z:PRIMARY> cfg = rs.conf()
z:PRIMARY> cfg.members[2].votes = 0
z:PRIMARY> cfg
{
        "_id" : "z",
        "version" : 1,
        "members" : [
                {
                        "_id" : 1,
                        "host" : "localhost:27001"
                },
                {
                        "_id" : 2,
                        "host" : "localhost:27002"
                },
                {
                        "_id" : 3,
                        "host" : "localhost:27003",
                        "votes" : 0
                }
        ]
}
z:PRIMARY> rs.reconfig(cfg)
{ "ok" : 1 }

~% mongo --shell a.js --port 27003
z:SECONDARY> part4()
233
Answer: 233
Final 5
-------
z:PRIMARY> a = {_id : 1,
  author : 'joe',
  title : 'Too big to fail',
  text : 'Sample text',
  tags : [ 'business', 'finance' ],
  when : ISODate("2008-11-03"),
  views : 23002,
  votes : 4,
  voters : ['joe', 'jane', 'bob', 'somesh'],
  comments : [
    { commenter : 'allan',
      comment : 'Well, i dont think so…',
      flagged:false, plus:2 },
    { commenter : 'chuck',
      comment : 'where is',
      flagged:true, plus:5 },
    { commenter : 'norris',
      comment : 'my leg',
      flagged:true, plus:77777777 },

  ]
}

z:PRIMARY> db.postings.insert(a)
One way to assure people vote at most once per posting is to use this form of update:
db.postings.update(
  { _id:… , voters:{$ne:'joe'} },
  { $inc : {votes:1}, $push : {voters:'joe'} } );
Answer: db.postings.find( { "comments.flagged" : true } )
        db.postings.update(
                   { _id:… , voters:{$ne:'joe'} },
                   { $inc : {votes:1}, $push : {voters:'joe'} } );

Final 6
-------
MongoDB supports atomic operations on individual documents.
MongoDB has a data type for Dates and DateTime data.
Answer: 2 & 4
Final 7
-------
MongoDB supports reads from slaves/secondaries that are in remote locations.
Answer: 2

Final 8
-------
mkdir ./data
mkdir ./data/configdb
mongod --configsvr --dbpath ./data/configdb --port 27019
mongorestore ./config_server --host 127.0.0.1:27019

~%/Documents/Mongo/dba/final.8$
~%/Documents/Mongo/dba/final.8$ mongo localhost:27019/config
MongoDB shell version: 2.2.2
connecting to: localhost:27018/config
configsvr>
configsvr> db
config
configsvr> db.chunks.find().sort({_id:1}).next().lastmodEpoch.toString().substr(20,4)
6554
configsvr> ^C
bye
~%/Documents/Mongo/dba/final.8$
> configsvr> db.chunks.find().sort({_id:1}).next().lastmodEpoch.getTimestamp().toString().substr(19,5)
> 07:07
Answer: 07:07

Final 9
-------
mongorestore --oplogReplay s1 --host SPHINX --port 27017
mongorestore --oplogReplay s2 --host SPHINX --port 27017
-- Update shards
>db.shards.update({"_id":"s1"},{$set : {"host":"localhost:27501"}})
>db.shards.update({"_id":"s2"},{$set : {"host":"localhost:27601"}})
configsvr> db.shards.find()
{ "_id" : "s1", "host" : "localhost:27501" }
{ "_id" : "s2", "host" : "localhost:27601" }

-- Run des deux shards
-- FAUX mongod --shardsvr --port 27501 --fork --logpath s1.log --replSet s1
-- FAUX mongod --shardsvr --port 27601 --fork --logpath s2.log --replSet s2
-- Pas de RS à préciser
>mongod --shardsvr --port 27501 --fork --logpath s1.log
> (remove /data/db/mongo.lock si besoin)
>mongod --shardsvr --port 27601 --fork --logpath s2.log
mongos> use snps
switched to db snps
mongos> db.elegans.aggregate([{$match:{N2:"T"}},{$group:{_id:"$N2",n:{$sum:1}}}]).result[0].n
47664

Final 10
--------
mongos> db.elegans.ensureIndex({"N2":1,"mutant":1})
removed == 1 + _files.size()
mongos>
Answer: 2 shards are queried.
        10 documents are scanned.

Friday, February 13, 2015

[Mongodb] M-102 Week 6 Assignments

---- 6.1
-- Stats sur la collections initiale
mongos> db.trades.stats()
{
 "sharded" : false,
 "primary" : "shard0000",
 "ns" : "week6.trades",
 "count" : 1000001,
 "size" : 244000240,
 "avgObjSize" : 243.999996000004,
 "storageSize" : 315006976,
 "numExtents" : 15,
 "nindexes" : 1,
 "lastExtentSize" : 84426752,
 "paddingFactor" : 1,
 "systemFlags" : 1,
 "userFlags" : 0,
 "totalIndexSize" : 32458720,
 "indexSizes" : {
  "_id_" : 32458720
 },
 "ok" : 1
}
-- On retrouve le même nombre de document dans le shard car il est unique
mongos> homework.a()
1000001

---- 6.2
> 3

---- 6.3
> 2 (shards bien presents)

[Mongodb] M-102 Week 5 Assignments

Homework: Homework 5.1

Set up a replica set that includes an arbiter.
To demonstrate that you have done this, what is the text in the "state" field for the arbiter when you run rs.status()?
7



Homework: Homework 5.2

You have a replica set with two members, having decided that a total of two copies of your data is sufficient.
You are concerned about robustness, however.
Which of the following are options that will allow you to ensure that failover can occur if the first server goes down?
Check all that apply.
[yes] Add an arbiter.
Increase the number of votes for the first replica set member from one to two.
Increase the priority of the first server from one to two.
[yes] Add another data bearing node.
 

Homework: Homework 5.3

At the beginning of the section, "W Timeout and Capacity Planning", Dwight mentioned something that hasn't yet been touched upon in this class: Batch Inserts.
He mentions that you can find information on them in the docs, but he hasn't used them in the shell during this course.
Your job is to look this functionality up in the documentation, and use it for the following problem:
Please insert into the m102 collection, the following 3 documents, using only one shell query (please use double quotes around each key in each document in your answer):
  • { "a" : 1 }
  • { "b" : 2 }
  • { "c" : 3 }
Hints: This is not the same as the Bulk() operations, which were discussed earlier in this course. Also, this does not involve semicolons, so don't put any into the text box. You probably want to test this on your machine, and then copy and paste the insert query in the box below. Do not include the > sign from the beginning of the shell.
db.m102.insert([ { "a" : 1 }, { "b" : 2 }, { "c" : 3 } ])

Homework: Homework 5.4

You would like to create a replica set that is robust to data center failure.
You only have two data centers available. Which arrangement(s) of servers will allow you to be stay up (as in, still able to elect a primary) in the event of a failure of either data center (but not both at once)? Check all that apply.
All 3 servers in data center 1.
2servers in data center 1, one server in data center 2.
[yes] None of the above.
 

Homework: Homework 5.5

Consider the following scenario: You have a two member replica set, a primary, and a secondary.
The data center with the primary goes down, and is expected to remain down for the foreseeable future. Your secondary is now the only copy of your data, and it is not accepting writes. You want to reconfigure your replica set config to exclude the primary, and allow your secondary to be elected, but you run into trouble. Find out the optional parameter that you'll need, and input it into the box below for your rs.reconfigure(new_cfg, OPTIONAL PARAMETER).
Use "double quotes" around the key(s) in your document.
Hint: You may want to use this documentation page to solve this problem.
{"force" : true}

[Mongodb] M-102 Week 4 Assignments


# starting as a standalone server for problem 1:
mongod --dbpath 1 --port 27001 --smallfiles --oplogSize 50
mongo --port 27001 --shell week4.js
> homework.init()
> homework.a()

result : 5001

rs.init()
use week4
> db.foo.find()
> homework.b()

answer: 5002

mongo --port 27002 week4.js --shell
> homework.c()

result : 5

rs.help()
> rs.stepDown(300)
cfg = rs.conf()
cfg.members.shift()
rs.reconfig(cfg)
homework.d()

answer: 6

> // expect to get back false as we are not primary
> db.isMaster().ismaster
false
> db.oplog.rs.find()
> db.oplog.rs.stats()
db.oplog.rs.find().sort({$natural:1}).limit(1).next().o.msg[0]

answer: R
 

[Mongodb] M-102 Week 3 Assignments

-- 3.1

db.sensor_readings.find( {   tstamp : {      $gte : ISODate("2012-08-01"),      $lte :  ISODate("2012-09-01")    },   active : true } ).limit(3).explain()
{
 "cursor" : "BasicCursor",
 "isMultiKey" : false,
 "n" : 3,
 "nscannedObjects" : 2696,
 "nscanned" : 2696,
 "nscannedObjectsAllPlans" : 2696,
 "nscannedAllPlans" : 2696,
 "scanAndOrder" : false,
 "indexOnly" : false,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 6,
 "indexBounds" : {
 
 },
 "server" : "cmo3233:27017"
}

> db.sensor_readings.ensureIndex({tstamp:1, active:1})
> db.sensor_readings.find( {   tstamp : {      $gte : ISODate("2012-08-01"),      $lte :  ISODate("2012-09-01")    },   active : true } ).limit(3).explain()
{
 "cursor" : "BtreeCursor tstamp_1_active_1",
 "isMultiKey" : false,
 "n" : 3,
 "nscannedObjects" : 3,
 "nscanned" : 5,
 "nscannedObjectsAllPlans" : 7,
 "nscannedAllPlans" : 9,
 "scanAndOrder" : false,
 "indexOnly" : false,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "tstamp" : [
   [
    ISODate("2012-08-01T00:00:00Z"),
    ISODate("2012-09-01T00:00:00Z")
   ]
  ],
  "active" : [
   [
    true,
    true
   ]
  ]
 },
 "server" : "cmo3233:27017"
}
============
> homework.a()
That's good - you can submit the answer below (just submit the number).  however there
is an index you could
create that will get nscanned down to 3 for the query of interest. see if you can
do that or watch the answer video for details on that...
6
============
=> Second index de test
> db.sensor_readings.ensureIndex({active:1, str:1, tstamp:1, x:1})
---- 3.2 (currentOp / killOp)
> homework.c()
> 12
-- 3.3
> db.runCommand({compact : 'sensor_readings'})
> homework.d()
21

[Mongodb] M-102 Week 2 Assignments

-----2.1

> homework.a()
3

---- 2.2

var prod1 = db.products.findOne({_id:ObjectId("507d95d5719dbef170f15c00")})

prod1.term_years = 3

db.products.update({_id:prod1._id},prod1)

prod1.limits.sms.over_rate = 0.01

db.products.update({_id:prod1._id},prod1)

> homework.b()
0.050.019031


---- 2.3

> db.products.find({"limits.voice":{$exists:1}}).count()

3

---- 2.4

> db.products.ensureIndex({for:1})

Q1 : 4
Q2 : 4
Q3 : YES

---- 2.5

// Trouver les items et boucler dessus
> db.products.find({for:{$exists:1},for:"ac3"}).forEach( function(item) { printjson(item) } )

// Réaliser l'update
> db.products.find({for:{$exists:1},for:"ac3"}).forEach( function(item) { db.products.update({_id:item._id},{$inc:{price:2}}) } )

> homework.c()
89.5954.5

Saturday, January 10, 2015

[Teradata] Temporal table example

I wanted to show you an example of the TEMPORAL feature in Teradata. Its a pretty neat feature.

Basically, it will allow us to keep a history of changes where required. For example, the FULL REFRESH dimensions, will change (based on the customers feedback) to a CDC (change data capture). This is normally a pretty complex process.

With the TEMPORAL feature, coupled with UPSERTS (MERGE), it becomes quite simple.


First - TEMPORAL. Basically Teradata under the covers of a table will manage a "PERIOD" data type - instead of a process having to manage the classic START_DT and END_DT logic. In fact, you dont even need to know about the column.

Also - when a user does a select * from the table, they will only get the CURRENT rows returned.

Please see this example to help.

Example FULL REFRESH Table that will move to this process:

CREATE MULTISET TABLE EDW_STAGE.tempo_test_auto_purchase_pack_size ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      auto_purchase_pack_size_id INTEGER,
      duration_mins INTEGER,
      pack_size INTEGER,
      display_order INTEGER,
      is_active CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      is_default CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      created_dt TIMESTAMP(6),
      created_by_user_id VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
      last_update_dt TIMESTAMP(6),
      last_updated_by_user_id VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
      is_deleted CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      source_sys_id SMALLINT,
      company_brand_cd VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
       duration PERIOD(TIMESTAMP(6) WITH TIME ZONE) NOT NULL
         AS TRANSACTIONTIME
      
      )
 PRIMARY INDEX ( auto_purchase_pack_size_id );
 
NOTE the column called duration - it will NOT have values inserted...Teradata manages this.
I will do an initial insert of 3 columns and those look like this when we select them:

select * from tempo_test_auto_purchase_pack_size;

 
Note The 2nd row (ID of 3) where the column duration_mins is 40.

We issue an update (or merge) to this table:
update tempo_test_auto_purchase_pack_size set duration_mins = 95 where auto_purchase_pack_size_id = 3;

Now look at the select * from tempo_test_auto_purchase_pack_size;

 
Note that the new value is the LATEST row....BUT if we want to see history:

  TRANSACTIONTIME AS OF TIMESTAMP '2015-01-10 11:24:10' select * from tempo_test_auto_purchase_pack_size;


 
And finally - so see them all:

 NONSEQUENCED TRANSACTIONTIME select * from tempo_test_auto_purchase_pack_size;


 
Note that with this query you see the duration column for this temporal feature.


Hope this helps.