More CSV Exercises

DDOS CSV File

Write a program, ddos_reformat.py, that will convert the text file ddos_attack_record.txt to a comma-delimited file named ddos_attack_record.csv. Maintain the column order.

The first 3 lines of ddos_attack_record.txt look like this:

TCP   192.168.2.401:54599    86.22.213.44:30739    ESTABLISHED
TCP   192.168.2.401:54633    177.19.180.123:9433    PIN_WAIT_2
TCP   192.168.2.401:54651    179.79.57.231:29820    ESTABLISHED

The first 3 lines of ddos_attack_record.csv should look like this:

TCP,192.168.2.401:54599,86.22.213.44:30739,ESTABLISHED
TCP,192.168.2.401:54633,177.19.180.123:9433,PIN_WAIT_2
TCP,192.168.2.401:54651,179.79.57.231:29820,ESTABLISHED
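
One possible starting point, sketched under the assumption that the columns are separated only by runs of whitespace and that no field contains a space. The function name `reformat` and the in-memory demo are illustrative and not part of the exercise; a real script would open ddos_attack_record.txt and ddos_attack_record.csv instead.

```python
import csv
import io


def reformat(lines, out_file):
    """Write each whitespace-separated line as one comma-delimited row."""
    writer = csv.writer(out_file, lineterminator="\n")
    for line in lines:
        fields = line.split()  # splits on any run of spaces or tabs
        if fields:             # skip blank lines
            writer.writerow(fields)


# Small in-memory demo using two of the sample lines shown above.
sample = [
    "TCP   192.168.2.401:54599    86.22.213.44:30739    ESTABLISHED\n",
    "TCP   192.168.2.401:54633    177.19.180.123:9433    PIN_WAIT_2\n",
]
out = io.StringIO()
reformat(sample, out)
print(out.getvalue(), end="")
```

Because `str.split()` with no arguments discards the padding, the column order is preserved without any extra bookkeeping.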

Split Logs

Write a program, oobelib_split.py, which reads the pipe-delimited file oobelib.csv and creates separate CSV files for the debug, info, warn, and error entries in oobelib.csv. Make sure to include all relevant information from each line in its respective file.

Name the files oobelib-debug.csv, oobelib-error.csv, oobelib-info.csv, and oobelib-warning.csv.

For example, the first 3 lines of oobelib-error.csv would be:

04/14/16 12:45:43:937,[ERROR],,,,OPMWrapper,,,16423,Failed in getting value for key in OPMGetValueForKey
04/14/16 14:03:29:118,[ERROR],,,,SLCoreService,,,47967,Could not find license from which to get user data
04/14/16 14:03:29:118,[ERROR],,,,SLCoreService,,,47967,No value for key [PersonGUID] in user dictionary.
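
A rough sketch of the grouping step, assuming the log level sits in the second column wrapped in square brackets, as in the `[ERROR]` lines above. The `[DEBUG]` and `[WARN]` spellings, the `SUFFIXES` mapping, and the function name `split_rows` are guesses made for illustration and should be checked against the real data.

```python
import csv
import io

# Assumed mapping from the level tag in column 2 to the output file suffix.
SUFFIXES = {
    "[DEBUG]": "debug",
    "[INFO]": "info",
    "[WARN]": "warning",
    "[ERROR]": "error",
}


def split_rows(in_file):
    """Group pipe-delimited rows by log level, keeping every column."""
    groups = {suffix: [] for suffix in SUFFIXES.values()}
    for row in csv.reader(in_file, delimiter="|"):
        if len(row) > 1 and row[1] in SUFFIXES:
            groups[SUFFIXES[row[1]]].append(row)
    return groups


# In-memory demo; a real script would open oobelib.csv instead.
sample = io.StringIO(
    "04/14/16 12:45:43:694|[INFO]||||OOBELib|||16423|Version 9.0.0.2,7.0\n"
    "04/14/16 12:45:43:937|[ERROR]||||OPMWrapper|||16423|"
    "Failed in getting value for key in OPMGetValueForKey\n"
)
groups = split_rows(sample)
print({suffix: len(rows) for suffix, rows in groups.items()})
```

Each group could then be written to `oobelib-<suffix>.csv` with `csv.writer(...).writerows(rows)`. Note that a field containing a comma, such as `Version 9.0.0.2,7.0`, will be quoted by the writer so that it still reads back as a single column.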

Pipe to Comma

Write a program, pipe_to_comma.py, that reads a pipe-delimited file and outputs a comma-delimited file.

The program should take a pipe-delimited filename as input and add a .csv extension to create the CSV output filename.

Test the program with the oobelib.csv file.
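
A minimal sketch of the conversion, assuming the instruction about the extension is read literally (so `oobelib.csv` would become `oobelib.csv.csv`). The names `output_name` and `pipe_to_comma` are illustrative only.

```python
import csv
import io


def output_name(input_name):
    """Append .csv to the input filename, as the exercise describes."""
    return input_name + ".csv"


def pipe_to_comma(in_file, out_file):
    """Copy rows from a pipe-delimited file to a comma-delimited one."""
    reader = csv.reader(in_file, delimiter="|")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerows(reader)


# In-memory demo; the second row has a comma inside one field.
demo_in = io.StringIO("a|b|c\n1|2,5|3\n")
demo_out = io.StringIO()
pipe_to_comma(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

Letting the csv module do both the reading and the writing means fields that contain commas are quoted automatically, which a plain `str.replace("|", ",")` would not handle.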

Code Review Pairings

Write a program that matches up code reviewers such that each person in a list of people is reviewing 2 other people’s code and no two people are reviewing each other’s code.

Example input file:

Amber
Christopher
Katie
Russell
Charlotte
Elias
Chiang

Example output file:

Name,To Review,To Review,Reviewed By,Reviewed By
Chiang,Katie,Christopher,Elias,Russell
Katie,Christopher,Amber,Chiang,Russell
Christopher,Amber,Charlotte,Chiang,Katie
Amber,Charlotte,Elias,Katie,Christopher
Charlotte,Elias,Russell,Christopher,Amber
Elias,Russell,Chiang,Amber,Charlotte
Russell,Chiang,Katie,Charlotte,Elias
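
One way to satisfy the constraints, offered as a sketch and not as the intended solution: shuffle the names into a ring and have each person review the next two people in it. The example output above happens to be consistent with this pattern. The function name `make_pairings` and the `rng` parameter are hypothetical, and the approach assumes at least 5 people, since with fewer the reviewers and reviewees of one person would overlap.

```python
import random


def make_pairings(names, rng=random):
    """Return rows of (name, to_review_1, to_review_2, reviewed_by_1, reviewed_by_2)."""
    if len(names) < 5:
        raise ValueError("the ring approach needs at least 5 people")
    ring = list(names)
    rng.shuffle(ring)
    n = len(ring)
    rows = []
    for i, name in enumerate(ring):
        rows.append((
            name,
            ring[(i + 1) % n],  # reviews the next person in the ring
            ring[(i + 2) % n],  # ...and the one after that
            ring[(i - 1) % n],  # reviewed by the previous person
            ring[(i - 2) % n],  # ...and the one before that
        ))
    return rows


names = ["Amber", "Christopher", "Katie", "Russell", "Charlotte", "Elias", "Chiang"]
for row in make_pairings(names):
    print(",".join(row))
```

The rows can be written with `csv.writer` after a header row matching the example output.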

Here’s a script to test the output file you generate:

import csv
import sys


def fail(message):
    print(message)
    sys.exit(1)


def main(input_file):
    reviewers_for = {}
    reviewees_for = {}
    reader = csv.reader(input_file)
    next(reader)  # skip the header row
    # Columns are: Name, To Review (x2), Reviewed By (x2)
    for name, reviewee1, reviewee2, reviewer1, reviewer2 in reader:
        reviewers = set((reviewer1, reviewer2))
        reviewees = set((reviewee1, reviewee2))
        if name in reviewers:
            fail(f"{name} is their own reviewer.")
        if name in reviewees:
            fail(f"{name} is their own reviewee.")
        if reviewers & reviewees:
            fail(f"{name} has overlapping reviewers and reviewees.")
        reviewers_for[name] = reviewers
        reviewees_for[name] = reviewees
    for name, reviewees in reviewees_for.items():
        for other in reviewees:
            if name not in reviewers_for[other]:
                fail(f"{other} is {name}'s reviewee, but {name} isn't {other}'s reviewer.")
    print("Looks good.")


if __name__ == '__main__':
    with open(sys.argv[1], newline='') as input_file:
        main(input_file)

Pipe-delimited Log File Converter

Write a program, oobelib_convert.py, which reads the oobelib.log file and outputs a modified version of it that can be read by the csv module (maintain the | delimiter). Name the new file oobelib.csv.

The first 3 lines of the oobelib.log input file look like this:

04/14/16 12:45:43:694 | [INFO] |  |  |  | OOBELib |  |  | 16423 | __OOBELIB_LOG_FILE__
04/14/16 12:45:43:694 | [INFO] |  |  |  | OOBELib |  |  | 16423 | *************OOBELib Session Starts*************
04/14/16 12:45:43:694 | [INFO] |  |  |  | OOBELib |  |  | 16423 | Version 9.0.0.2,7.0

We want the first 3 lines of the oobelib.csv output file to look like this:

04/14/16 12:45:43:694|[INFO]||||OOBELib|||16423|__OOBELIB_LOG_FILE__
04/14/16 12:45:43:694|[INFO]||||OOBELib|||16423|*************OOBELib Session Starts*************
04/14/16 12:45:43:694|[INFO]||||OOBELib|||16423|Version 9.0.0.2,7.0
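
Comparing the two samples, the only difference is the padding around each `|`. A minimal sketch of that cleanup follows, assuming no field contains a pipe or a double quote of its own. The helper names `clean_line` and `convert` are illustrative.

```python
def clean_line(line):
    """Strip the padding around each pipe-separated field."""
    fields = line.rstrip("\n").split("|")
    return "|".join(field.strip() for field in fields)


def convert(in_file, out_file):
    """Write a cleaned copy of every line from in_file to out_file."""
    for line in in_file:
        out_file.write(clean_line(line) + "\n")


print(clean_line(
    "04/14/16 12:45:43:694 | [INFO] |  |  |  | OOBELib |  |  | 16423 | Version 9.0.0.2,7.0\n"
))
```

If the real log does contain quotes or embedded pipes, writing the stripped fields through `csv.writer(out_file, delimiter="|")` would be the safer choice, since it adds quoting where needed.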

Track RPMs

Given this file of drive data for hard drives that should each spin at 7200 rpm, write a program, rotation_rate.py, that does the following:

  1. Reads a CSV file containing timed records of rotations (in degrees) and temperatures of different make/models of hard drives.

  2. Returns a dictionary with (make, model) tuples as keys and a list of RPMs per reading as values. The rpm values should be for each individual reading, not the cumulative rpm over time.

Time between readings can be assumed to be 0.001 seconds.

Note

You may use the following formula for rpm: (current_degrees - last_degrees) / 360 / .001 * 60
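
As a quick check of the formula: a drive turning exactly 7200 rpm covers 43.2 degrees in 0.001 seconds, and 43.2 / 360 / 0.001 * 60 = 7200. The sketch below applies the formula to a list of readings, assuming the degree values are cumulative so that consecutive differences give the rotation per reading. The function name `rpms_from_degrees` is hypothetical, and grouping by make and model is left out because the column layout of the data file is not shown here.

```python
SECONDS_PER_READING = 0.001  # stated in the exercise


def rpms_from_degrees(degrees):
    """Return the rpm for each consecutive pair of cumulative degree readings."""
    rpms = []
    for last, current in zip(degrees, degrees[1:]):
        rpms.append((current - last) / 360 / SECONDS_PER_READING * 60)
    return rpms


# Three readings give two rpm values, both very close to 7200.
print(rpms_from_degrees([0.0, 43.2, 86.4]))
```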

Example usage:

$ python rotation_rate.py
{('Z', '1'): [7116.66, 7266.66, 7233.33], ('Z', '2'): [7183.33, 7300.00, 7250.00], ('Z', '3'): [7150.00, 7366.66, 7083.33]}

The output should look like this file.

Parsing Ping Data

Create a program parse_pings.py which should do the following:

  1. Read each line of this ping output file and extract the values for bytes, IP, icmp_seq, ttl, and time.

  2. Create the CSV file and write a header row followed by one row per ping reply.

The CSV file should be in this format:

bytes,IP,ICMP Sequence,TTL,time (ms)
64,182.162.94.6,7,45,175
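
A sketch of the extraction step using a regular expression. The input line shape is an assumption based on typical Linux/macOS ping output, since the ping file itself is not shown here; the pattern, `PING_RE`, `HEADER`, and `parse_ping` are all illustrative and may need adjusting to the real file.

```python
import re

# Assumed line shape; adjust to the real file:
#   64 bytes from 182.162.94.6: icmp_seq=7 ttl=45 time=175 ms
PING_RE = re.compile(
    r"(?P<bytes>\d+) bytes from (?P<ip>[\d.]+): "
    r"icmp_seq=(?P<seq>\d+) ttl=(?P<ttl>\d+) time=(?P<time>[\d.]+) ms"
)

HEADER = ["bytes", "IP", "ICMP Sequence", "TTL", "time (ms)"]


def parse_ping(line):
    """Return [bytes, IP, icmp_seq, ttl, time], or None if the line is not a reply."""
    match = PING_RE.search(line)
    if match is None:
        return None
    return [match[name] for name in ("bytes", "ip", "seq", "ttl", "time")]


print(parse_ping("64 bytes from 182.162.94.6: icmp_seq=7 ttl=45 time=175 ms"))
```

Lines that return `None` (for example a summary or header line) can simply be skipped, and the remaining rows written after `HEADER` with `csv.writer`.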

Run your program like this:

$ python parse_pings.py pings.txt pings.csv

For reference, the CSV file you produce should look like this file.