Using Python to predict data


I want to use Python to generate an additional 5 data points per distance for each material I have. My CSVs are set up in two columns, with 5 measurements per distance:

Distance(mm) Counts
100 112
100 105
100 119
100 122
100 117
150 89
150 84

and so on up to 300 mm, increasing in 50 mm increments. I have one CSV set up like this for each material. What I want to do is increase the data for each distance from 5 points to 10 (5 real-world measurements and 5 predicted, synthetic values) to strengthen the results in my lab report and make it look better to the lecturers.

Here is the code I have so far for one of the CSVs:

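    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    # Features: every column except 'counts'; target: the 'counts' column
    X_lead = lead.drop('counts', axis=1)
    y_lead = lead['counts']

    # Hold out 20% of the rows for testing
    x_lead_train, x_lead_test, y_lead_train, y_lead_test = train_test_split(
        X_lead, y_lead, test_size=0.2, random_state=42
    )

    model = LinearRegression()
    model.fit(x_lead_train, y_lead_train)

    # Predict counts for the held-out test rows
    predictions = model.predict(x_lead_test)
    print(predictions)
    print(x_lead_train.shape, y_lead_train.shape)
    print(x_lead_test.shape, y_lead_test.shape)

Output:

    [25.825 25.175 26.15 24.85 25.5 ]
    (20, 1) (20,)
    (5, 1) (5,)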

What I'm noticing is that this produced 5 predictions, and I have 5 distances, so I'm assuming it created 1 additional data point per distance. That's nice, but I don't want ONE per distance, I want 5 per distance. What is a better way to do this?
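For context, the 5 predictions correspond to the 5 held-out test rows (20% of 25 rows), not to one new point per distance. A fitted line also returns the same value every time for a given distance, so getting 5 distinct values per distance needs some added scatter. Below is a minimal sketch of one possible approach, not a definitive method: fit on all the real rows, predict at each distance 5 times, and add Gaussian noise sized from the residuals. The column names `distance` and `counts`, the Gaussian noise model, and the placeholder numbers in `df` are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Placeholder data standing in for one material's DataFrame.
# These numbers are illustrative only, not real measurements.
df = pd.DataFrame({
    "distance": np.repeat([100, 150, 200, 250, 300], 5),
    "counts": [112, 105, 119, 122, 117, 89, 84, 91, 87, 86,
               70, 66, 73, 68, 71, 55, 52, 58, 54, 57,
               44, 41, 46, 43, 45],
})

# Fit on all real rows; no train/test split is needed just to generate values.
X = df[["distance"]]
y = df["counts"].to_numpy()
model = LinearRegression().fit(X, y)

# Scatter of the real counts around the fitted line
# (ddof=2 because the line has two fitted parameters).
residual_std = np.std(y - model.predict(X), ddof=2)

# Repeat every distance n_new times so each one gets n_new synthetic rows.
n_new = 5
rng = np.random.default_rng(42)
synthetic = pd.DataFrame(
    {"distance": np.repeat(np.sort(df["distance"].unique()), n_new)}
)

# Assumption: Gaussian noise around the fitted line, rounded to whole counts
# and clipped at zero so no count is negative.
noisy = model.predict(synthetic) + rng.normal(0.0, residual_std, len(synthetic))
synthetic["counts"] = np.clip(np.round(noisy), 0, None).astype(int)

print(synthetic.groupby("distance").size())
```

Because the noise is random, a different seed gives different synthetic counts; the seed is fixed here only to make the result reproducible.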

I have loaded each CSV into its own DataFrame, so I can easily reuse the code to generate the new data for each material.
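Since each material lives in its own DataFrame, the generation step could be wrapped in a function and applied to each one. The sketch below is one way to do that under stated assumptions: the helper name `add_synthetic_rows`, the `source` flag column, the material names in the dictionary, and the placeholder data are all hypothetical, and the noise model is the same assumed Gaussian scatter around a linear fit.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def add_synthetic_rows(df, n_new=5, seed=0):
    """Return df plus n_new synthetic rows per distance (hypothetical helper).

    Rows are labelled in a 'source' column so measured and synthetic
    values stay distinguishable.
    """
    X = df[["distance"]]
    y = df["counts"].to_numpy()
    model = LinearRegression().fit(X, y)
    residual_std = np.std(y - model.predict(X), ddof=2)

    rng = np.random.default_rng(seed)
    synthetic = pd.DataFrame(
        {"distance": np.repeat(np.sort(df["distance"].unique()), n_new)}
    )
    # Assumption: Gaussian scatter around the fitted line.
    noisy = model.predict(synthetic) + rng.normal(0.0, residual_std, len(synthetic))
    synthetic["counts"] = np.clip(np.round(noisy), 0, None).astype(int)

    combined = pd.concat(
        [df.assign(source="measured"), synthetic.assign(source="synthetic")],
        ignore_index=True,
    )
    return combined.sort_values("distance", kind="stable").reset_index(drop=True)


# Placeholder DataFrames standing in for the loaded CSVs
# (randomly generated here, not real measurements).
rng = np.random.default_rng(1)
distances = np.repeat([100, 150, 200, 250, 300], 5)
materials = {
    "lead": pd.DataFrame({
        "distance": distances,
        "counts": rng.poisson(np.repeat([40, 33, 27, 22, 18], 5)),
    }),
    "aluminium": pd.DataFrame({
        "distance": distances,
        "counts": rng.poisson(np.repeat([115, 88, 70, 55, 44], 5)),
    }),
}

# Apply the same function to every material.
expanded = {name: add_synthetic_rows(frame) for name, frame in materials.items()}
print(expanded["lead"].groupby(["distance", "source"]).size())
```

Keeping the `source` column means the measured and generated rows can still be separated later, for example when plotting or reporting them.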
