Developer Resources
Downloading Data from AWS S3
To list and download the files you ordered, use the following AWS CLI commands. Note that you only have access to the time range and countries included in your order.
List all countries in our AWS S3 Bucket
aws s3 ls --profile YOUR_CREDENTIALS_FOR_TECHMAP \
--summarize s3://BUCKET_NAME
List all files in the UK directory
aws s3 ls --profile YOUR_CREDENTIALS_FOR_TECHMAP \
--summarize s3://BUCKET_NAME/uk/
Download one file for the USA from May 4th 2023
aws s3 cp --profile YOUR_CREDENTIALS_FOR_TECHMAP \
s3://BUCKET_NAME/us/techmap_jobs_us_2023-05-04.jsonl.gz .
Download all files for the USA from April 2023
aws s3 sync --profile YOUR_CREDENTIALS_FOR_TECHMAP \
s3://BUCKET_NAME/us/ . \
--exclude "*" --include "techmap_jobs_us_2023-04-*"
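After a sync, it can help to compare the local directory against the file names you expected to receive. The following is a minimal sketch that assumes the naming pattern techmap_jobs_<country>_<YYYY-MM-DD>.jsonl.gz, inferred from the single example file name above, and assumes one file per country per day. Neither assumption is confirmed here, and expectedFileNames is a hypothetical helper, not part of any Techmap tooling.

```javascript
// Build the expected file names for one country over an inclusive date range.
// ASSUMPTION: the pattern techmap_jobs_<country>_<YYYY-MM-DD>.jsonl.gz is
// inferred from the example file name in the download commands.
// ASSUMPTION: there is exactly one file per country per day.
function expectedFileNames(country, startDate, endDate) {
  const names = [];
  // Work in UTC so local time zones and DST changes cannot shift the date
  const day = new Date(startDate + 'T00:00:00Z');
  const end = new Date(endDate + 'T00:00:00Z');
  while (day <= end) {
    const isoDate = day.toISOString().slice(0, 10); // YYYY-MM-DD
    names.push(`techmap_jobs_${country}_${isoDate}.jsonl.gz`);
    day.setUTCDate(day.getUTCDate() + 1);
  }
  return names;
}

// Usage: list the names expected for the USA in April 2023
// const expected = expectedFileNames('us', '2023-04-01', '2023-04-30');
```

Comparing this list with the output of fs.readdirSync on the download directory shows which days, if any, are missing.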
Parsing Data Files
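In JavaScript / TypeScript you can use JSON.parse() to parse each line of the JSON files within a directory. Each line holds one JSON object, so the files are read line by line. The example assumes the downloaded .gz files have already been decompressed.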
const fs = require('fs');
const path = require('path');
const readline = require('readline');
const directoryPath = './techmap/files/...';
fs.readdir(directoryPath, function (err, files) {
  if (err) {
    console.error('Unable to scan directory: ' + err);
    return;
  }
  // Iterate through each file in the directory
  files.forEach(function (file) {
    // Only process decompressed JSON / JSON Lines files
    const extension = path.extname(file);
    if (extension === '.json' || extension === '.jsonl') {
      // Read the file line by line; path.join inserts the path separator
      const readStream = fs.createReadStream(path.join(directoryPath, file));
      const rl = readline.createInterface({
        input: readStream,
        crlfDelay: Infinity
      });
      rl.on('line', function (line) {
        // Skip blank lines, e.g. a trailing newline at the end of the file
        if (line.trim() === '') return;
        try {
          // Parse the JSON data in the line
          const jsonData = JSON.parse(line);
          // TODO: do something with the JSON - e.g., store in your own DB
          console.log(jsonData);
        } catch (err) {
          console.error('Unable to parse JSON data in file ' + file + ' on line: ' + line + ': ' + err);
        }
      });
    }
  });
});
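The file names in the download commands above end in .jsonl.gz, so the files arrive gzip-compressed. As a minimal sketch, assuming you prefer to read them without decompressing to disk first, Node's built-in zlib module can be piped into readline. The function name readGzippedJsonLines and its onRecord callback are hypothetical names chosen for illustration.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const readline = require('readline');

// Stream a gzip-compressed JSON Lines file and call onRecord once per parsed line.
// Resolves with the number of parsed lines and the number of lines that were not valid JSON.
// NOTE: readGzippedJsonLines and onRecord are illustrative names, not an official API.
async function readGzippedJsonLines(filePath, onRecord) {
  // Decompress on the fly instead of writing a decompressed copy to disk
  const input = fs.createReadStream(filePath).pipe(zlib.createGunzip());
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  let parsed = 0;
  let skipped = 0;
  for await (const line of rl) {
    if (line.trim() === '') continue; // ignore blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch (err) {
      skipped++; // count unparsable lines instead of aborting the whole file
      continue;
    }
    onRecord(record);
    parsed++;
  }
  return { parsed, skipped };
}

// Usage, with the file downloaded in the example above:
// readGzippedJsonLines('./techmap_jobs_us_2023-05-04.jsonl.gz', (job) => {
//   // TODO: do something with the JSON - e.g., store in your own DB
//   console.log(job);
// }).then((counts) => console.log(counts));
```

Streaming keeps memory use roughly constant regardless of file size, because only one line is held at a time. This sketch does not handle a missing or corrupt file; production code would also attach error handlers to both streams.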
In Java, you can use the javax.json (JSON-P) library to parse all lines in the JSON files within a directory. The library is not part of Java SE, so an implementation must be on the classpath.
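import java.io.IOException;
import java.io.StringReader;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.json.*;
public class ParseTechmapFiles {
    public static void main(String[] args) throws IOException {
        String directoryPath = "./techmap/files/...";
        // Collect all decompressed JSON / JSON Lines files in the directory
        List<Path> jsonFiles;
        try (Stream<Path> paths = Files.list(Path.of(directoryPath))) {
            jsonFiles = paths
                .filter(path -> path.toString().endsWith(".json") || path.toString().endsWith(".jsonl"))
                .collect(Collectors.toList());
        }
        // Parse every non-blank line of every file as one JSON object
        List<JsonObject> jsonObjects = new ArrayList<>();
        for (Path path : jsonFiles) {
            for (String line : Files.readAllLines(path)) {
                if (line.isBlank()) continue;
                try (JsonReader reader = Json.createReader(new StringReader(line))) {
                    jsonObjects.add(reader.readObject());
                }
            }
        }
        // Do something with the list of JSON objects
        for (JsonObject jsonObject : jsonObjects) {
            // TODO: do something with the JSON - e.g., store in your own DB
            System.out.println(jsonObject.toString());
        }
    }
}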
For other programming languages, see the libraries section of the JSON homepage. Remember that we also provide sample data you can test with.