Commit 442ac255 authored by Reinhard Munz

Update README.md

parent 6c084771
# UniTraX
UniTraX is a data analytics system that provides users with a personal
differentially private bound on privacy loss. UniTraX allows more queries than
previous systems without giving up analytic accuracy.

This repository contains a research prototype implementation of UniTraX.
## Installation
The following reflects the setup we used for our experiments.
* Windows Server 2016 (on two separate machines)
* MS SQL Server (on Machine 1; this is the DB server)
* Visual Studio (on Machine 2; this is the UniTraX server and client machine)
* .NET Framework 4.7.2 (C#)
* Git
* PINQ from https://www.microsoft.com/en-us/research/project/privacy-integrated-queries-pinq/
* Java
* Financial/Medical data from https://sorry.vse.cz/~berka/challenge/pkdd1999/chall.htm
* Mobility data from https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page (we used an older version of January 2013 yellow cab data)

Apply the patch from the source folder to PINQ's PINQueryable.cs.
Visual Studio should indicate which NuGet packages must be installed.
The record transformer converts the data to the format required by UniTraX. It is
implemented in Java and has only been used on OS X.

We prepared the mobility data manually, so the repository contains no record
transformer for that dataset. Please consult the mobility dataset's data access
files to infer a working schema and produce corresponding CSV files.
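For the manually prepared mobility data, a small Java helper along the following lines could select and reorder columns from a raw CSV export once a schema has been inferred. This is only a sketch and is not part of the repository: the class name `CsvProjectionSketch`, the column names in `main`, and the choice to keep a header row are all assumptions, and the naive comma split does not handle quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvProjectionSketch {

    /**
     * Keeps only the wanted columns, in the wanted order.
     * The first input line is treated as the header row.
     * Assumes plain comma-separated fields without quoting.
     */
    public static List<String> project(List<String> lines, List<String> wanted) {
        String[] header = lines.get(0).split(",", -1);

        // Resolve each wanted column name to its position in the header.
        int[] index = new int[wanted.size()];
        for (int i = 0; i < wanted.size(); i++) {
            index[i] = -1;
            for (int j = 0; j < header.length; j++) {
                if (header[j].trim().equals(wanted.get(i))) {
                    index[i] = j;
                    break;
                }
            }
            if (index[i] < 0) {
                throw new IllegalArgumentException("Missing column: " + wanted.get(i));
            }
        }

        List<String> out = new ArrayList<>();
        out.add(String.join(",", wanted));
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split(",", -1);
            List<String> kept = new ArrayList<>();
            for (int j : index) {
                kept.add(fields[j].trim());
            }
            out.add(String.join(",", kept));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up header and row; the real column names must be taken from
        // the data access files and the downloaded dataset.
        List<String> raw = List.of(
                "trip_id,pickup_time,distance,unused",
                "1,2013-01-01 00:00:00,1.5,x");
        for (String line : project(raw, List.of("pickup_time", "distance"))) {
            System.out.println(line);
        }
    }
}
```

Whether the target tables expect a header row, and which columns they expect, depends on the schema inferred from the data access files.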
## Operation
In our setup, Machine 1 has nothing but MS SQL Server installed and serves as the
remote database server. Machine 2 does everything else, including experiment
setup and teardown.

Experiments roughly follow this plan:
1. Start Experiment App on Machine 2
2. Connect to Machine 1
   * Drop any databases
   * Stop MS SQL service
   * Remove RAM-disk
   * Create RAM-disk
   * Start MS SQL service
3. Load data from disk of Machine 2 into RAM of Machine 2
4. Connect to Machine 1
   * Create DB (with files on RAM-disk)
   * Create tables
   * Load data into tables (add budget information on the fly)
   * Create indexes
   * Make DB read-only
5. Set up one of Direct (no protection), PINQ, or UniTraX (utilizing PINQ)
6. Execute queries
7. Dump log
8. Query for remaining privacy budgets (this sometimes takes a long time)
9. Dump budget data
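The "add budget information on the fly" part of the loading step can be pictured with the following Java sketch, which appends an initial budget field to each record as it is read. It is not taken from the UniTraX code base, where loading is done by the C# experiment app. The class name `BudgetColumnSketch`, the four-decimal formatting, and the initial budget value are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BudgetColumnSketch {

    // Hypothetical initial per-record privacy budget; the value used in the
    // actual experiments is not stated in this README.
    public static final double INITIAL_BUDGET = 1.0;

    /** Appends a budget field to one comma-separated record. */
    public static String withBudget(String csvRow, double budget) {
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        return csvRow + "," + String.format(Locale.ROOT, "%.4f", budget);
    }

    /** Adds the same initial budget to every record while it is being loaded. */
    public static List<String> addBudgets(List<String> rows, double budget) {
        List<String> out = new ArrayList<>(rows.size());
        for (String row : rows) {
            out.add(withBudget(row, budget));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for records already held in RAM.
        List<String> rows = List.of("1,1995-03-24,1000.00", "2,1995-04-13,3679.00");
        for (String row : addBudgets(rows, INITIAL_BUDGET)) {
            System.out.println(row);
        }
    }
}
```

Attaching the budget while loading means the files on disk stay in the plain format produced by the record transformer, and only the database tables carry the extra column.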

We used the log converter to read the logs and plot graphs with gnuplot. Like the
record transformer, this utility is implemented in Java and has only been used on OS X.

Good luck :D