Reading Data from MongoDB
In the data logging notebook, you learned how to write data to a MongoDB database directly from a microcontroller using MongoDB's Data API. In this notebook, you will learn how to read data from a MongoDB database using the official MongoDB Python driver, PyMongo. Note that you will read from a different database collection than the one you wrote to in the previous notebook. While it would be more instructive to read back the same data that you wrote, users only have write access to the collection from the previous notebook, to avoid potential misuse. Likewise, in this notebook, users only have read access to a separate collection that holds a fixed set of data.
# only install if we are running in colab
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
    %pip install pymongo pandas
First, instantiate the PyMongo client and connect to the database.
from pymongo.mongo_client import MongoClient
# normally, the MongoDB credentials would be kept private
# but for the purposes of this tutorial we will share them
MONGODB_PASSWORD = "HGzZNsQ3vBLKrXXF"
# Connection string obtained via MongoDB Atlas "Connect" button
blinded_connection_string = "mongodb+srv://test-user-find-only:<password>@test-cluster.c5jgpni.mongodb.net/?retryWrites=true&w=majority"
# Replace <password> with the MongoDB password (where again, the connection
# string and password would normally be kept private)
connection_string = blinded_connection_string.replace("<password>", MONGODB_PASSWORD)
# Create a new client and connect to the server
client = MongoClient(connection_string)
# Send a ping to confirm a successful connection
try:
    client.admin.command('ping')
    print("Pinged your deployment. You successfully connected to MongoDB!")
except Exception as e:
    print(e)
Pinged your deployment. You successfully connected to MongoDB!
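Outside a tutorial, the password would usually come from the environment rather than from the notebook source. The sketch below shows one way to assemble the connection string under that assumption. The environment variable name `MONGODB_PASSWORD`, the fallback value, and the helper `build_connection_string` are hypothetical and not part of PyMongo. The helper also percent-escapes the password with `urllib.parse.quote_plus`, which PyMongo expects when credentials contain reserved characters such as `@` or `:`.

```python
import os
from urllib.parse import quote_plus


def build_connection_string(template: str, password: str) -> str:
    """Fill the <password> placeholder with a percent-escaped password."""
    if "<password>" not in template:
        raise ValueError("template has no <password> placeholder")
    return template.replace("<password>", quote_plus(password))


# Same template as in the cell above.
template = (
    "mongodb+srv://test-user-find-only:<password>"
    "@test-cluster.c5jgpni.mongodb.net/?retryWrites=true&w=majority"
)

# Hypothetical variable name; the fallback only keeps this sketch runnable.
password = os.environ.get("MONGODB_PASSWORD", "example-p@ss")
example_connection_string = build_connection_string(template, password)
```

The result can be passed to `MongoClient` in the same way as `connection_string` above.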
Next, read all entries from the collection. In this case, the collection holds only three pre-existing entries (uploaded by the course developers). Your course ID will not be among them, for the reasons given above.
database_name = "test-db"
collection_name = "read-from-me"
db = client[database_name]
collection = db[collection_name]
# get all results
results = list(collection.find({}))
print(results)
[{'_id': ObjectId('65962c0458f0b76b37484b82'), 'course_id': 'happy-panda'}, {'_id': ObjectId('65962c3758f0b76b37484b83'), 'course_id': 'amused-zebra'}, {'_id': ObjectId('65962c4958f0b76b37484b84'), 'course_id': 'sorrowful-hippo'}]
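`find` also accepts a filter and a projection, so you can request only the documents and fields you need instead of everything. The following is a minimal sketch: `get_course_ids` is a hypothetical helper, and it assumes every matching document has a `course_id` field, as the three entries above do.

```python
def get_course_ids(collection, query=None):
    """Return the course_id of every document that matches query.

    `collection` is expected to behave like a PyMongo collection, i.e. to
    offer find(filter, projection). This helper is hypothetical and is not
    part of PyMongo.
    """
    # Projection: include course_id and exclude the default _id field.
    projection = {"_id": 0, "course_id": 1}
    cursor = collection.find(query or {}, projection)
    return [doc["course_id"] for doc in cursor]
```

With the `collection` object defined above, `get_course_ids(collection)` should return the three course IDs, and `get_course_ids(collection, {"course_id": "happy-panda"})` should return only the matching one.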
Create a pandas DataFrame from the collection entries.
# NOTE: your course ID will not appear in the results
import pandas as pd
df = pd.DataFrame(results).set_index("_id")
df
_id | course_id |
---|---|
65962c0458f0b76b37484b82 | happy-panda |
65962c3758f0b76b37484b83 | amused-zebra |
65962c4958f0b76b37484b84 | sorrowful-hippo |
Finally, export the DataFrame to a CSV file and print the CSV file contents.
df.to_csv("results.csv")
with open('results.csv', 'r') as file:
    print(file.read())
_id,course_id
65962c0458f0b76b37484b82,happy-panda
65962c3758f0b76b37484b83,amused-zebra
65962c4958f0b76b37484b84,sorrowful-hippo
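If pandas is not available, the standard-library `csv` module can produce an equivalent file. This is a sketch rather than a drop-in replacement: `write_results_csv` is a hypothetical helper, and it assumes every document has the `_id` and `course_id` fields shown above. `str()` is applied to each value so that an `ObjectId` is written as its hex string.

```python
import csv


def write_results_csv(results, path, fieldnames=("_id", "course_id")):
    """Write a list of documents (dicts) to a CSV file with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        writer.writeheader()
        for doc in results:
            writer.writerow({name: str(doc[name]) for name in fieldnames})
```

For example, `write_results_csv(results, "results.csv")` would write the `results` list from the earlier cell.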
Aside: Uploading data using PyMongo
In this notebook, we described how to read data via the PyMongo driver. In the data logging notebook, we uploaded data using the Data API since we were using a microcontroller. However, when PyMongo is available, it is more straightforward and efficient to upload with PyMongo as well. To upload a single document, you can use the insert_one method of the collection object defined above:
# Define the document to be inserted
document = {"course_id": "fluffy-rabbit"}
# Insert the document into the collection
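result = collection.insert_one(document)
# insert_one returns an InsertOneResult; the new document's _id is stored on it
inserted_id = result.inserted_id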
To upload multiple documents, you can use the insert_many method with a list of documents:
# Define the documents to be inserted
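documents = [
    {"course_id": "course1"},
    {"course_id": "course2"},
    {"course_id": "course3"},
]
# Insert the documents into the collection
result = collection.insert_many(documents)
# insert_many returns an InsertManyResult; inserted_ids lists the new _id values
ids = result.inserted_ids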
These concepts are covered in the PyMongo tutorial at https://pymongo.readthedocs.io/en/stable/tutorial.html. The code above is formatted as a text cell rather than a code cell because the database credentials in this notebook only allow read access, as described at the beginning (i.e., running it in this specific notebook would raise an error).
Now, you can close the connection to MongoDB.
client.close()
Additional Resources
Once you've successfully run this notebook, return to the data logging notebook.