Esercizi aggregation su MongoDB - Books

file book.txt

Json del documento

{

"_id" : 9,

"title" : "Griffon in Action",

"isbn" : "1935182234",

"pageCount" : 375,

"publishedDate" : ISODate("2012-06-04T07:00:00Z"),

"thumbnailUrl" : "https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.book-thumb-images/almiray.jpg",

"shortDescription" : "Griffon....",

"longDescription" : "Although...",

"status" : "PUBLISH",

"authors" : [ "Andres Almiray", "Danno Ferrin", "", "James Shingler"],

"categories" : [ "Java"]

}

Query da fare:

derive the categories of the books written by “Andres Almiray”
derive the average length of the books on “Java”
derive the shortest books in each category
derive the most prolific authors wrt the number of books they have written
rank the authors wrt the number of pages they have written ○ for a book with n authors and p pages, author contribution is estimated as p/n
derive the most referred book categories

Group Accumulator Operators

$sum (Returns a sum of numerical values. Ignores non-numeric values.)

$avg (Returns an average of numerical values. Ignores non-numeric values.)

$first (Returns a value from the first document for each group. Order is only defined if the documents are in a defined order. - Available in $group stage only.)

$last (Returns a value from the last document for each group. Order is only defined if the documents are in a defined order. - Available in $group stage only.)

$max (Returns the highest expression value for each group. - Available in both $group and $project stages)

$min (Returns the lowest expression value for each group. - Available in both $group and $project stages)

$push (Returns an array of expression values for each group. - Available in $group stage only)

$addToSet (Returns an array of unique expression values for each group. Order of the array elements is undefined. - Available in $group stage only.)

$stdDevPop (Returns the population standard deviation of the input values. - Available in both $group and $project stages)

$stdDevSamp (Returns the sample standard deviation of the input values. - Available in both $group and $project stages)

{$sort: { field1: sortOrder, field2 : sortOrder, ... } }

{ $cmp: [ expression1, expression2] } -1 0 oppure +1 in base al valori di comparazione

$addFields { $addFields: { newField: expression, ... } }

$size { $size: expression }

$ne{ $ne: [ expression1, expression2 ] } (diverso)

Limit { $limit: positiveInteger }

Varie
$gt (Maggiore - db.inventory.find( { qty: { $gt: 20 } } ) )

1) derive the categories of the books written by “Andres Almiray”


                    db.books.aggregate([                                                            

                    { $match: { authors: {$in: ["Andres Almiray","$authors"]}}},                    

                    { $group : { _id : "NULL", categories: {$addToSet: "$categories"  }}},          

                    { $project: { _id:0, autore:"Andres Almiray", categories:1 }}                   

                    ])

2) derive the average length of the books on “Java”


                    db.books.aggregate([                                                    

                    { $match: { categories: {$in: ["Java", "$categories"]}}},               

                    { $group: { _id:"NULL", avarage: {$avg: "$pageCount"}}},                

                    { $project: {_id:0, JavaAvarageLenght: "$avarage"}}                     

                    ]).pretty()

3) derive the shortest books in each category [io]


                    db.books.aggregate([                                                                              

                    { $unwind: "$categories" },                                                                       

                    { $sort : { pageCount : -1}},                                                                     

                    { $group: {_id: {categories:"$categories", title:"$title" }, npage:{$push: "$pageCount"}}},       

                    { $unwind: "$npage" },                                                                            

                    { $project: {_id:"NULL", categories:"$_id.categories", title:"$_id.title",npage:"$npage"}},       

                    { $sort : { npage : -1}},                                                                         

                    { $group: {_id:"$categories", book: {$last:"$title"}, npage: {$last: "$npage"}}},                 

                    { $project: {_id:0, categories:"$_id", title:"$book",npage:"$npage"}}                             

                    ]).pretty()

3) derive the shortest books in each category [Giusto - Proff]


                    db.books.aggregate([                                                                                                                                        

                    {$unwind: "$categories"},                                                                                                                                   

                    {$sort: {"pageCount":1}},                                                                                                                                   

                    {$group: {"_id":{"category":"$categories"}, "books":{$addToSet: {"title":"$title","pages":"$pageCount"}}, "shortest":{$first:"$pageCount"}}},               

                    {$unwind: "$books"},                                                                                                                                        

                    {$addFields: {"toBeSelected": {$cmp: ["$books.pages","$shortest"]}}},                                                                                       

                    {$match: {"toBeSelected":0}},                                                                                                                               

                    {$group: {"_id":{"category":"$_id.category"}, "books":{$addToSet: {"title":"$books.title","pages":"$books.pages"}}, "cfr":{$first:"$shortest"}}}            

                    ])

4) derive the most prolific authors wrt the number of books they have written


                    db.books.aggregate([                                    

                    { $unwind: "$authors" },                                

                    { $group:{ _id : "$authors", nbook:{$sum: 1} }},        

                    { $sort: { nbook:-1 }}                                  

                    ]).pretty()

5) rank the authors wrt the number of pages they have written (for a book with n authors and p pages, author contribution is estimated as p/n )


                    db.books.aggregate([                                                          

                    { $addFields: { "nauthors" : { $size: "$authors" }}},                         

                    { $match : { nauthors : {$gt: 0}}},                                           

                    { $addFields: { "authorsPage": {$divide: ["$pageCount", "$nauthors"]}}},      

                    { $unwind:"$authors" },                                                       

                    { $group:{ _id:"$authors", npage: {$sum: "$authorsPage"}}},                   

                    { $sort: { npage:-1 }}                                                        

                    { $project: {_id:0, autor:"$_id", pagewritten:"$npage"}}                    

                    ]).pretty()

6) derive the most referred book categories


                    db.books.aggregate([                                        

                    { $unwind : "$categories"},                                 

                    { $group: { _id:"$categories", t: {$sum: 1}}},              

                    { $sort: {t:-1}},                                           

                    { $limit: 1 },                                              

                    { $project: {_id:0, Category:"$_id", numberBook:"$t"}}      

                    ])