"My Computer Science Degree Journey" by Martin

Posts

CST 383 - Week 7

February 24, 2026

In this lecture, I learned about logistic regression and how it functions as a classification model. A key insight was understanding that logistic regression can be viewed as linear regression combined with a sigmoid “squashing” function, which transforms the output into a probability between 0 and 1. I learned how to describe the model both mathematically and conceptually. It computes a linear combination of input features and applies the sigmoid function to estimate the probability of a binary outcome. I also learned how to select the best logistic regression model using training data and evaluate its performance using appropriate metrics. For homework, I implemented logistic regression using Scikit-Learn to predict customer churn. I trained the model, generated predictions, and evaluated its performance. This hands-on practice helped me connect the theoretical concepts to practical implementation.
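
To make the "linear regression plus sigmoid" idea concrete, here is a minimal sketch in plain Python. The weights, bias, and customer values are made up for illustration and are not from the homework; a library such as Scikit-Learn would learn the weights from training data instead.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Linear combination of the inputs, followed by the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical weights and one hypothetical customer (illustration only).
weights = [0.8, -0.5]
bias = -0.2
customer = [2.0, 1.0]

p = predict_proba(customer, weights, bias)  # estimated probability of class 1
label = 1 if p >= 0.5 else 0                # 0.5 is a common default threshold
print(p, label)
```

The 0.5 threshold is only a common default; a different cutoff may suit a problem where one kind of error is more costly than the other.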

CST 383 - Week 5

February 10, 2026

Something I would like to analyze and talk about is the section on Missing Data. Before reading about it, I did not really consider how important it is to properly deal with missing data. I used to think that you could just remove rows or ignore the problem, but now I understand that handling missing values is a big part of data science. The way pandas provides built-in functions to detect, remove, and replace missing data is very helpful. In the examples from the book and slides, the datasets were small, so it looked simple to manage. However, I imagine that when working with much larger datasets, these tools become even more important and powerful. I am still thinking about when it is better to delete rows or columns versus when it is better to replace the missing values. It seems like deleting data could remove useful information, especially if many rows contain missing values. On the other hand, replacing values with the mean, median, or mode could introduce bias if it is not...
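
The detect, remove, and replace steps mentioned above might look roughly like this in pandas. This is only a sketch on a tiny made-up table; the column names are hypothetical, and using the median is just one possible choice for filling values.

```python
import numpy as np
import pandas as pd

# Tiny made-up dataset; the column names are hypothetical.
df = pd.DataFrame({
    "age": [25, np.nan, 40, 35],
    "income": [50000, 62000, np.nan, 58000],
})

missing_per_column = df.isna().sum()  # detect: count missing values per column
dropped = df.dropna()                 # remove: drop rows with any missing value
filled = df.fillna(df.median())       # replace: fill with each column's median

print(missing_per_column)
print(len(df), "rows before,", len(dropped), "rows after dropping")
print(filled)
```

Even in this small table, dropping rows throws away half of the data, which illustrates the trade-off between deleting and replacing.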

CST 383 - Week 3

January 27, 2026

This week focused on data visualization, especially working with one variable and two variables. I feel more comfortable using pandas to explore data and Matplotlib to create histograms and other plots, but I’m still getting used to deciding which visualization is the best choice for a given dataset. Sometimes different plots seem like they could work, and I’m not always sure which one communicates the information most clearly. I understand the general idea behind histograms, bar charts, and scatter plots, but I still have questions about how changes in bin size or scale can affect how the data is interpreted. Another thing I am still thinking about is how much a visualization can influence conclusions, since the same data can look very different depending on how it is displayed. Overall, this week made me realize that data visualization is not just about making graphs, but about making choices that affect how data is understood.
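
One way to see the effect of bin size is to count the same data with two different bin settings. The sketch below uses NumPy's `np.histogram`, which computes bin counts without drawing anything; the sample values are made up for illustration.

```python
import numpy as np

# Made-up sample with two clusters of values.
data = np.array([1, 2, 2, 3, 3, 3, 8, 9, 9, 10])

# Same data, two different numbers of equal-width bins.
coarse_counts, coarse_edges = np.histogram(data, bins=2)
fine_counts, fine_edges = np.histogram(data, bins=9)

print("2 bins:", coarse_counts, coarse_edges)
print("9 bins:", fine_counts, fine_edges)
```

With only two bins the data looks like two solid blocks, while nine bins reveal the empty gap between the clusters, so the same values can suggest different stories.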

CST 383 - Week 2

January 20, 2026

Before getting into the new material, I want to mention that after a review I was able to understand more about machine learning this week. This week was about Pandas, a Python library for manipulating data. I learned several Pandas functions for visualizing and editing data. I also learned some statistics concepts. I remembered some of them, but I had forgotten a lot, so I had trouble understanding the mean of a probability density function (PDF). I still somewhat understand topics like skew and standard deviation, but I’m probably going to review the rest, just like I did with machine learning last week. I’m also going to look for more Pandas tutorials.

CST 383 - Week 1

January 13, 2026

This week was an introduction week. The slides covered a review of Python and a guide on how to install Anaconda, which is a Python distribution that bundles data science packages. I had never used Anaconda or the Spyder IDE before, but both were easy to use. I still haven’t used all the tools in Spyder, but I will probably use more of them when I work with Matplotlib. I practiced with the NumPy library, which provides fast functions for working with arrays. I had practiced it before, but honestly I had forgotten the syntax, so this was a good refresher. I also learned about filtering, which was new for me because I had never worked with NumPy boolean arrays before. For the machine learning part, I still have trouble understanding the training data. I get confused about what the data points are and whether they come from the Iris dataset or are created by the program. I get even more confused by the k-nearest neighbors algorithm, but I am still studying so I can understand it better.
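
As a small sketch of the filtering idea, a comparison on a NumPy array produces a boolean array, and indexing with that boolean array keeps only the matching elements. The numbers here are made up for illustration.

```python
import numpy as np

values = np.array([3, 12, 7, 20, 5])

mask = values > 6        # boolean array: one True/False per element
filtered = values[mask]  # keeps only the elements where mask is True

print(mask)
print(filtered)
```

The mask and the array must have the same shape, since each True or False lines up with one element.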

CST 370 - Week 7

October 21, 2025

I have always had trouble understanding what dynamic programming is. I used to think it was just a function or something similar, but I think I understand it better now. It’s more like a strategy for solving problems by reusing the answers to smaller subproblems, and this approach is used in algorithms such as Bellman-Ford and Floyd-Warshall.
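
To show the strategy in action, here is a minimal sketch of Floyd-Warshall in plain Python. The three-node graph is made up for illustration. The dynamic programming part is that each round `k` reuses the shortest distances already found using only the earlier nodes as intermediate stops.

```python
INF = float("inf")  # marks "no direct edge"

def floyd_warshall(dist):
    """Return all-pairs shortest distances for an adjacency matrix."""
    n = len(dist)
    d = [row[:] for row in dist]  # copy so the input is not modified
    for k in range(n):            # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                # Reuse previously computed answers for i->k and k->j.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Made-up directed graph: graph[i][j] is the edge weight from i to j.
graph = [
    [0, 4, INF],
    [INF, 0, 1],
    [2, INF, 0],
]

shortest = floyd_warshall(graph)
for row in shortest:
    print(row)
```

This sketch assumes there are no negative cycles in the graph.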
