CSCI 13300 SP 2025

Last updated: 2025-04-05

Unit 11 Assessment

Due date and submission

This assignment is due April 26th at 11:59 PM. Submit your solution on Brightspace, under the “Unit 11” assignment.

Please copy your code into the text box, preserving the indentation so that it appears exactly as it does in IDLE, VS Code, or wherever you wrote the code. This will make it easier for me to grade.

You can submit multiple times. I will only grade your last submission.

Prerequisites

To do this assignment you will need pandas and matplotlib installed. (NumPy, which the helper functions also use, is installed automatically as a dependency of pandas.) We went through the installation in class; if you need a refresher, the steps are easy to find online.

Data

You will use these datasets in your assignment:

Helper functions

Here are some helper functions for plotting. The function plot_regression is tailored to dealing with the data for KKR (and other) stocks, while plot_coords will work for the general “Coordinates” file given above.

When predicting values based on the regression, you may want to print the coefficients generated by np.polyfit within the plot_regression function and plug them into Desmos to see how the function behaves. To evaluate the function in the year 2027, you will need to work out how the numeric index is converted to a date, so that you understand how the function given by the regression coefficients relates to the date.

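import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def plot_coords(filename):
    # The coordinates file has no header row: column 0 is x, column 1 is y
    df = pd.read_csv(filename, header=None)
    plt.plot(df[0], df[1], '.')
    plt.show()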

def plot_regression(filename, col_x, col_y, degree, num_days=None, stripchar='$'):
    df = pd.read_csv(filename)
    df[col_y] = df[col_y].str.lstrip(stripchar).astype(float)
    df[col_x] = pd.to_datetime(df[col_x], format='%m/%d/%Y')

    # Sort from earliest to latest
    df = df.sort_values(by=col_x)

    # Filter to the most recent num_days, if specified
    if num_days is not None:
        latest_date = df[col_x].max()
        earliest_date = latest_date - pd.Timedelta(days=num_days)
        df = df[df[col_x] >= earliest_date]

    df.reset_index(drop=True, inplace=True)

    # Fit polynomial to the index (which is now chronological)
    coeffs = np.polyfit(df.index, df[col_y], deg=degree)

    # Generate line for plotting
    line_x = df.index[::max(1, len(df)//100)]
    line_y = np.polyval(coeffs, line_x)

    # Plot
    plt.figure(figsize=(10, 6))
    plt.plot(df.index, df[col_y], label="Data")
    plt.plot(line_x, line_y, color='red', label=f"Poly (deg {degree})")
    plt.xticks(df.index[::max(1, len(df)//10)], df[col_x].dt.date[::max(1, len(df)//10)], rotation=25)
    plt.legend()
    plt.title(f"Polynomial Regression (Last {num_days} Days)" if num_days else "Polynomial Regression")
    plt.xlabel(col_x)
    plt.ylabel(col_y)
    plt.tight_layout()
    plt.show()
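
Because plot_regression fits against df.index after reset_index, x = 0 is the earliest row that was kept, and each step of 1 is one row of the file rather than one calendar day. The following is a minimal sketch of mapping a future date to an index and evaluating the polynomial there. It assumes the data has one row per weekday and ignores market holidays, so it is only an approximation. The helper name date_to_index, the dates, and the coefficients are all made up for illustration and are not taken from the KKR data.

```python
import numpy as np

def date_to_index(first_date, target_date):
    """Approximate the row index of target_date.

    Hypothetical helper: assumes one row per weekday starting at
    first_date (index 0) and ignores market holidays.
    """
    return int(np.busday_count(first_date, target_date))

# Made-up linear coefficients in np.polyfit order (highest degree first)
coeffs = np.array([0.05, 40.0])  # y = 0.05*x + 40.0

first_date = "2020-04-06"  # hypothetical date of the row with index 0
x_future = date_to_index(first_date, "2027-01-04")
predicted = np.polyval(coeffs, x_future)
print(x_future, predicted)
```

If you filter with num_days, the row with index 0 changes, so first_date has to be the earliest date that remains after filtering.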

Tasks

When answering non-code questions, write the answer as a comment in the file next to the relevant pieces of code you wrote to get your answer. You should have one comment for each of the tasks below.

Note for the stock prediction questions: if you are having trouble predicting the exact values for the year 2027 using the regression coefficients, you can extend the graph by hand and estimate what the values will be.

  1. Plot the coordinates in the Coordinates 4 file. What do you see?
  2. Plot the trajectory of the KKR stock over the last 5 years (the entire dataset). Do a linear regression (deg=1) using np.polyfit like we saw in class, and use the resulting parameters to predict what the price of KKR stock will be in 2027.
  3. Now, do a quadratic regression (with np.polyfit and deg=2) on the KKR stock data for the last 240 days. You should get 3 numbers as a result, which correspond to the coefficients m_2, m_1, and m_0 respectively of the polynomial y = m_2x^2 + m_1x + m_0. Based on this regression, what do you expect the price of KKR stock to be in 2027?
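
To make the coefficient order concrete, here is a small sketch on synthetic data generated from a known quadratic (not the KKR file). It shows that np.polyfit returns the coefficients from the highest degree down, and that plugging them into the polynomial by hand agrees with np.polyval.

```python
import numpy as np

# Synthetic data generated from a known quadratic: y = 2x^2 - 3x + 5
x = np.arange(10)
y = 2 * x**2 - 3 * x + 5

# np.polyfit returns coefficients from the highest degree down
m_2, m_1, m_0 = np.polyfit(x, y, deg=2)

# Evaluating the polynomial by hand matches np.polyval
x_new = 12
by_hand = m_2 * x_new**2 + m_1 * x_new + m_0
by_polyval = np.polyval([m_2, m_1, m_0], x_new)
print(m_2, m_1, m_0)
print(by_hand, by_polyval)
```

With real stock data the fit will not be exact, so the recovered coefficients describe only the best-fitting curve, and values far outside the fitted range should be treated with caution.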

Follow-up question (not graded, for fun): If someone were given $15,000 in KKR RSUs (restricted stock units) on February 14th, 2024, how much money (in unrealized gains) would that person have lost by April 4th, 2025, compared to the peak value attained over the ownership timeframe? (The answer should be a little over $10,000.)
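
One way to set up the arithmetic for this kind of question is sketched below with made-up prices (not real KKR data). It assumes the grant is converted to a number of units at the first price in the list and that the list covers the whole ownership period in chronological order.

```python
# Made-up closing prices in chronological order (not real KKR data)
prices = [100.0, 120.0, 150.0, 90.0]

grant_value = 15_000.0
shares = grant_value / prices[0]  # units received at the first price

peak_value = shares * max(prices)   # value at the highest price in the period
final_value = shares * prices[-1]   # value at the last price in the period
loss_from_peak = peak_value - final_value
print(loss_from_peak)
```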

Notes

You should be able to do all of the tasks with only the Python topics we covered in class so far.

If you want to use more complex functionality than what we discussed in class, the Python documentation may be helpful: Python 3.10 documentation

Additionally, the pandas and matplotlib documentation may be helpful: pandas documentation, matplotlib documentation