Accessing the data collected using RADAR-Base #
RADAR-base provides two storage systems: Cold Storage and Hot Storage. Cold Storage keeps the high-resolution data sent to the platform in a Hadoop Distributed File System (HDFS), where it can be used for retrospective analysis. The data in HDFS are stored by topic, so a data-extraction program was implemented to pull data from HDFS and restructure it into a study -> participant -> data topic -> date_hour.csv hierarchy.
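To make the restructured hierarchy concrete, here is a minimal sketch that builds and lists an example layout. The project, participant, topic, and file names below are illustrative, not taken from a real deployment:

```shell
# Build an illustrative study -> participant -> topic -> date_hour.csv tree.
# All names here are made up for demonstration purposes.
mkdir -p output-demo/radar-test/participant-1/android_phone_acceleration
touch output-demo/radar-test/participant-1/android_phone_acceleration/20190101_09.csv
# List the resulting hierarchy:
find output-demo -name '*.csv'
```

On a real instance, each participant folder contains one subfolder per data topic, and each topic folder holds one CSV file per hour of collected data.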
Hot Storage, on the other hand, stores aggregated data computed in real time using Kafka Streams. These data can be accessed through the REST API provided by RADAR-base or through the RADAR-base Dashboard component. This data pipeline is still under development.
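As a rough illustration of REST access, the sketch below assembles a request URL for a project and subject. The base URL and endpoint path here are assumptions, not the documented RADAR-base API routes; consult your deployment's API documentation for the real ones:

```shell
# Hypothetical sketch only: BASE_URL and the /projects/.../subjects/... path
# are placeholders, not confirmed RADAR-base REST API routes.
BASE_URL="https://radar.example.org/api"
PROJECT="radar-test"
SUBJECT="57d92da1-552d-4747-ad7c-d0059d51f621"
# Print the request that would be issued (the token is a placeholder,
# so the call itself is not executed here):
echo "curl -H 'Authorization: Bearer \$TOKEN' $BASE_URL/projects/$PROJECT/subjects/$SUBJECT"
```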
In this article we focus on how to access the data collected in HDFS using RADAR-base. As mentioned earlier, data storage and extraction are automated.
Suppose you have set up a server using RADAR-Docker, created a project called radar-test, and have two study participants (subjects) in that project who have been sending data to your server with the pRMT app. Now you want to access the data collected by the platform for further analysis.
The project view of the “radar-test” project. #
Note: the data extraction job is scheduled to run every hour, so follow the steps below an hour or two later, ideally after the data-collection period has ended.
Prerequisites #
- Root access on the server that hosts the RADAR-base platform using RADAR-Docker
How to access the data collected using RADAR-base #
Step 1: Login and go to the root folder of RADAR-Docker
Step 2: Navigate to radar-cp-hadoop-stack folder
Step 3: By default, the platform stores the data extracted from HDFS in a folder named output. You can verify the folder name in your environment by checking the RESTRUCTURE_OUTPUT_DIR variable in the .env file. The output folder is created inside the radar-cp-hadoop-stack directory.
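The check can be sketched as below. A one-line stand-in for the real .env is created here so the example runs anywhere; on a live deployment, simply run the grep against the existing file:

```shell
# Stand-in .env for demonstration; skip these two lines on a real server.
mkdir -p env-demo && cd env-demo
printf 'RESTRUCTURE_OUTPUT_DIR=output\n' > .env
# Show which folder the extracted data is written to:
grep '^RESTRUCTURE_OUTPUT_DIR' .env
```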
```
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack# ls -a
.  ..  .env  .gitignore  README.md  bin  docker-compose.yml  etc  hash-backup
images  lib  log  optional-services.yml  output  postgres-backup  travis
```
You will see a folder called output. It contains all of the data collected on your instance up to the time you access the server.
Step 4: Navigate to the output folder and list the files currently available there.
```
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output# ls
+tmp  bins.csv  offsets.csv  radar-test  snapshots
```
You will see a folder named after your project. It contains the data submitted by that project's participants.
Step 5: View available subjects/participants under that project.
```
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output# cd radar-test/
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output/radar-test# ls
57d92da1-552d-4747-ad7c-d0059d51f621  dbd52d98-e867-4e3f-a99f-b6d496429f31
```
Step 6: To download data, it is recommended to use the snapshot versions. Navigate to the snapshots folder and list the available files.
```
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output/radar-test/57d92da1-552d-4747-ad7c-d0059d51f621# cd ../../snapshots/
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output/snapshots# ls
+tmp  radar-test
```
Step 7: Navigate to your project folder. When you list the files, you will see a snapshot file created for each month, which can easily be downloaded as a package.
```
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output/snapshots# cd radar-test/
RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output/snapshots/radar-test# ls
201810.tmp.zip  201810.zip  201811.zip  201812.zip  201901.zip
```
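Pulling one of these monthly snapshots to a local machine would typically be done with scp. In the sketch below the hostname is a placeholder, and the command is echoed rather than run, since there is no real server to contact:

```shell
# Placeholders: substitute your own server address. The remote path matches
# the snapshot listing shown above.
REMOTE="root@your-server"
SNAPSHOT="201901.zip"
# Print the transfer command that you would run from your local machine:
echo "scp $REMOTE:RADAR-Docker/dcompose-stack/radar-cp-hadoop-stack/output/snapshots/radar-test/$SNAPSHOT ."
# After copying, unpack it locally with:  unzip 201901.zip
```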