My problem: Code works great, but the speed is far too slow for the number of rows it needs to process.
What I am doing: I initiate a COUNT(*) to pull total number of rows (which last night was ~58000) and I use that to create the loop to do the following: pull two columns of data from that row, datamine one column for text patterns.
Once I had completed that, I search a second table to see if that individual by username is already existing - if they exist, I update their row. If not, I add a new one.
There are 44 columns of data, one being the name the other 43 storing values of my datamining results.
In about 8 hours, it has completed 26500 out of the 58000 when it first started (in that same period, the table has grown to ~100000, but I am not worried about that).
Is there a better method to increase read/write rate?
Portion of my code - I have removed many of the int declarations and Regex.Matches as they are copies of the first with altered Match values.
azCheck is to determine if the message even contains anything we are looking for, it is remains '0', then we don't bother with the last portion of the code.
using (new MySqlConnection(ConnectiongString))
{
using (MySqlCommand cmd = new MySqlCommand("select count(*) from messages", connection))
{
using (MySqlDataReader reader = cmd.ExecuteReader())
{
StringBuilder sb = new StringBuilder();
while (reader.Read())
{
sb.Append(reader.GetInt32(0).ToString());
}
total_messages = int.Parse(sb.ToString());
}
}
}
Console.WriteLine(total_messages.ToString());
connection.Close();
for (int i = 1; i <= total_messages; i++)
{
connection.Open();
using (new MySqlConnection(ConnectiongString))
{
using (MySqlCommand cmd = new MySqlCommand("select * from messages WHERE id="+i+"", connection))
{
using (MySqlDataReader reader = cmd.ExecuteReader())
{
StringBuilder sb = new StringBuilder();
while (reader.Read())
{
username = reader["username"].ToString();
message = reader["message"].ToString();
}
}
}
}
connection.Close();
Console.Write("\r{0} ", i);
int aiCount = 0;
aiCount += Regex.Matches(message, "ai", RegexOptions.IgnoreCase).Count;
azCheck += aiCount;
//There are ~42 of the regex.matches after the first one.
MySqlCommand cmd1 = connection.CreateCommand();
connection.Open();
cmd1.CommandText = "SELECT username FROM users";
cmd1.CommandType = CommandType.Text;
cmd1.Connection = connection;
MySqlDataReader dr = cmd1.ExecuteReader();
while (dr.Read())
{
if (dr[0].ToString() == username)
{
check++;
}
}
connection.Close();
if (check == 0)
{
MySqlConnection connection2 = new MySqlConnection(ConnectiongString);
connection2.Open();
try
{
MySqlCommand cmd2 = connection2.CreateCommand();
cmd2.CommandText = "INSERT INTO users (username,aiCount) VALUES (@username,@aiCount)";
cmd2.Parameters.AddWithValue("@username", username);
cmd2.Parameters.AddWithValue("@aiCount", aiCount);
cmd2.ExecuteNonQuery();
connection2.Close();
}
catch (Exception)
{
throw;
}
} else {
int aiCount_old = 0;
if (azCheck > 0)
{
//Here we are taking the existing values from this users row,
//which we then add the new values from above and save.
MySqlConnection connection4 = new MySqlConnection(ConnectiongString);
connection4.Open();
try
{
MySqlCommand cmd2 = connection4.CreateCommand();
cmd2.CommandType = CommandType.Text;
cmd2.CommandText = "SELECT * from users WHERE username = @username";
cmd2.Parameters.AddWithValue("@username", username);
MySqlDataReader reader = cmd2.ExecuteReader();
while (reader.Read())
{
aiCount_old = Convert.ToInt32(reader["aiCount"].ToString());
}
}
catch (Exception)
{
throw;
}
connection4.Close();
aiCount += aiCount_old;
MySqlConnection connection5 = new MySqlConnection(ConnectiongString);
connection5.Open();
try
{
MySqlCommand cmd4 = connection5.CreateCommand();
cmd4.CommandType = CommandType.Text;
cmd4.CommandText = "UPDATE users SET aiCount = @aiCount WHERE LOWER(LTRIM(RTRIM(username))) = @username";
cmd4.Parameters.AddWithValue("@username", username.Trim().ToLower());
cmd4.Parameters.AddWithValue("@aiCount", aiCount.ToString());
cmd4.ExecuteNonQuery();
Console.WriteLine("User updated.");
}
catch (Exception ex)
{
throw;
}
connection5.Close();
INSERT INTO..SELECT? The fastest way, of course, is to do it all in one query. Not sure if that's possible, though. Can you post some sample data?INSERT INTOwon't work since I need to pull the existing value in order to add.INSERT INTOwon't work, my mistake. I agree with Paul, though. It's better to select all the rows and iterate through them, then to make tons of tiny queries. Going back and forth like that could bottleneck performance.