{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 5. Logging Data\n", "\n", "```{warning}\n", "As a temporary workaround to the issue described in https://github.com/orgs/micropython/discussions/15112, add [`urequests_2.py`](https://github.com/AccelerationConsortium/ac-microcourses/blob/main/docs/courses/hello-world/urequests_2.py) [[permalink](https://github.com/AccelerationConsortium/ac-microcourses/blob/e5541ce3ec307a8e5e0f2b20f000c03f040e1f56/docs/courses/hello-world/urequests_2.py)] to your microcontroller, and change `import urequests` to `import urequests_2 as urequests` in the code below. See https://github.com/orgs/micropython/discussions/15112 and https://github.com/micropython/micropython-lib/pull/861 for ongoing updates. The corresponding assignment will also be affected, but this can be addressed using the same workaround.\n", "```\n", "\n", "In this tutorial, you will learn how to upload data to a MongoDB database directly from a microcontroller and read data from a database using Python.\n", "\n", "\n", "\n", "*MongoDB database interface for a light-mixing database instance.*\n", "\n", "## FAIR Data\n", "\n", "Chemistry and materials research data is precious. By making your data **F**indable, **A**ccessible, **I**nteroperable, and **R**eusable ([FAIR](https://www.go-fair.org/fair-principles/)), you can maximize the impact of your research. Here are two conceptual examples of what FAIR data might look like in the physical sciences:\n", "\n", "![FAIR data in materials science](./images/fair-data-materials-science.png)\n", "\n", "*Defining FAIR Data in materials science. Reproduced from https://doi.org/10.1557/s43577-023-00498-4*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing to MongoDB Using the Data API\n", "\n", "For storing our data, we will be using [MongoDB](https://www.mongodb.com/), a popular [\"NoSQL\"](https://www.mongodb.com/nosql-explained) database. 
MongoDB is just one of many excellent database choices. It is a document-oriented database, which means it stores data in JSON-like documents. It is popular for internet-of-things (IoT) applications because it is easy to set up, easy to use, and scalable. Additionally, MongoDB offers a [Data API](https://docs.atlas.mongodb.com/data-api/) that allows data to be read and written directly from a microcontroller.\n", "\n", "For the purposes of this tutorial, we have set up a free-tier test database through MongoDB Atlas and provide an API key for test purposes. Distributing a public API key is generally not good practice, so to limit potential misuse we have granted only write permissions, and the database is configured to automatically delete entries once a certain storage threshold is reached.\n", "\n", "✅ Copy the following code into a new file on the microcontroller called `write_mongodb.py` and run the file. Note that you will need [`netman.py`](https://github.com/sparks-baird/self-driving-lab-demo/blob/main/src/public_mqtt_sdl_demo/lib/netman.py) [[permalink](https://github.com/sparks-baird/self-driving-lab-demo/blob/0ff0adec3e997c096990de594844d73a9ce18fd6/src/public_mqtt_sdl_demo/lib/netman.py)] and a file named `my_secrets.py` that defines your WiFi credentials (`SSID`, `PASSWORD`) and course ID (`COURSE_ID`)."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# based on https://medium.com/@johnlpage/introduction-to-microcontrollers-and-the-pi-pico-w-f7a2d9ad1394\n", "from netman import connectWiFi\n", "import urequests\n", "\n", "from my_secrets import SSID, PASSWORD, COURSE_ID\n", "\n", "connectWiFi(SSID, PASSWORD, country=\"US\")\n", "\n", "DATA_API_KEY = \"UT4cdinBetBaNqCBc5hISkaArhllv5dWfzXgbYsLYzpv79nqNhVwVsudQU5ZUmBE\"  # Public API key for demo purposes only\n", "CLUSTER_NAME = \"test-cluster\"\n", "DATABASE_NAME = \"test-db\"\n", "COLLECTION_NAME = \"write-to-me\"\n", "\n", "ENDPOINT_BASE_URL = (\n", "    \"https://us-east-2.aws.data.mongodb-api.com/app/data-ibmqs/endpoint/data/v1\"\n", ")\n", "\n", "endpoint_url = f\"{ENDPOINT_BASE_URL}/action/insertOne\"\n", "\n", "headers = {\"api-key\": DATA_API_KEY}\n", "document = {\"course_id\": COURSE_ID}\n", "\n", "payload = {\n", "    \"dataSource\": CLUSTER_NAME,\n", "    \"database\": DATABASE_NAME,\n", "    \"collection\": COLLECTION_NAME,\n", "    \"document\": document,\n", "}\n", "\n", "print(f\"sending document to {CLUSTER_NAME}:{DATABASE_NAME}:{COLLECTION_NAME}\")\n", "\n", "# Try the insert up to num_retries times, stopping at the first success\n", "num_retries = 3\n", "for _ in range(num_retries):\n", "    response = urequests.post(endpoint_url, headers=headers, json=payload)\n", "    txt = str(response.text)\n", "    status_code = response.status_code\n", "\n", "    print(f\"Response: ({status_code}), msg = {txt}\")\n", "\n", "    # Close the response to free the underlying socket\n", "    response.close()\n", "\n", "    # 201 (Created) indicates the document was inserted\n", "    if status_code == 201:\n", "        print(\"Added Successfully\")\n", "        break\n", "\n", "    print(\"Retrying...\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output should look something like the following (MAC address and IP address redacted, and your insertedId will be different):\n", "\n", "```python\n", "MAC address: ***\n", "connected\n", "ip = ***\n", "sending document to test-cluster:test-db:write-to-me\n", "Response: (201), msg = {\"insertedId\":\"6594bfbfb3c925d2fdfbb7e8\"}\n", "Added
Successfully\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading from MongoDB Using PyMongo\n", "\n", "✅ Run the code from [the companion notebook](./1.5.1-pymongo.ipynb) to read data from a MongoDB database.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading Material\n", "\n", "✅ Read [Community action on FAIR data will fuel a revolution in materials research](https://doi.org/10.1557/s43577-023-00498-4)\n", "\n", "✅ Watch [the following video](https://youtu.be/cdSENQPsAiI?si=KkwmGzoQmQP-CsQo) about materials databases:\n", "\n", "\n", "\n", "## Additional Resources\n", "- FAIR data principles [website](https://www.go-fair.org/fair-principles/) and [manuscript](https://doi.org/10.1038/sdata.2016.18)\n", "- [MongoDB: What is NoSQL?](https://www.mongodb.com/nosql-explained)\n", "- [MongoDB Getting Started Interactive Tutorial](https://www.mongodb.com/docs/manual/tutorial/getting-started/)\n", "- [Connecting a Raspberry Pi Pico to MongoDB Atlas](https://medium.com/@johnlpage/introduction-to-microcontrollers-and-the-pi-pico-w-f7a2d9ad1394)\n", "- [@sgbaird's list of materials databases](https://github.com/stars/sgbaird/lists/materials-databases)\n", "- Misc. [`awesome-materials-informatics`](https://github.com/tilde-lab/awesome-materials-informatics) (search for \"database\"), [`Materials-Databases` (no longer maintained)](https://github.com/blaiszik/Materials-Databases), [`awesome-chemistry-datasets`](https://github.com/kjappelbaum/awesome-chemistry-datasets)\n", "- [`awesome-self-driving-labs` research data management section](https://github.com/AccelerationConsortium/awesome-self-driving-labs#research-data-management)\n", "\n", "As a side note, MongoDB has a \"serverless\" option (i.e., pay only for what you use) that exceeds the free-tier limits and is more flexible than the shared and dedicated clusters, which may seem appealing at first. 
However, [costs will escalate quickly if the database is not optimized](https://www.mongodb.com/developer/products/atlas/serverless-instances-billing-optimize-bill-indexing/) (e.g., if the database is not indexed). If you decide to go with MongoDB Atlas for a project and need more than the 512 MB of free-tier storage, we recommend first considering the shared (e.g., M2, M5) and dedicated options, hosting your own MongoDB instance, or looking to other ecosystems. Another ecosystem can serve as either a full replacement or a supplement. For example, suppose that one step in your workflow involves capturing images, where the total storage required is on the order of GBs instead of MBs. Instead of saving your images directly to MongoDB (e.g., using [GridFS or the `BinData` type](https://www.mongodb.com/docs/manual/core/gridfs/)), you can upload each image to [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) or a similar service and add its uniform resource identifier (URI) to the corresponding MongoDB document along with other data such as sample name, image acquisition parameters, material composition, and processing conditions. The URI can later be used to [programmatically access the file from Amazon S3](https://chat.openai.com/share/4ec06c80-8915-4b21-9b3c-4e7a9abb186a)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once you've successfully finished the example and companion notebook from above and completed the reading material, you are done with this tutorial 🎉. Return to the course website to do a knowledge check." ] } ], "metadata": { "language_info": { "name": "python" }, "nbsphinx": { "execute": "never" } }, "nbformat": 4, "nbformat_minor": 2 }