I am trying to run a program that analyzes a bunch of text files containing numbers. The total size of the text files is ~12 MB, and I take 1,000 doubles from each of 360 text files and put them into a vector. My problem is that I get about halfway through the list of text files and then my computer slows down until it isn't processing any more files. The program is not in an infinite loop, but I think I have a problem with using too much memory. Is there a better way to store this data that won't use as much memory?
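(For scale, if my arithmetic is right: 360 files × 1,000 doubles × 8 bytes per double is only about 2.9 MB of raw values, so I wouldn't expect the stored points themselves to come anywhere near 8 GB.)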
Other possibly relevant system information:
Running Linux
8 GB memory
CERN ROOT framework installed (though I don't know how to reduce my memory footprint with it)
Intel Xeon Quad Core Processor
If you need other information, I will update this list
EDIT: I ran top, and my program's memory usage keeps climbing; once it got above 80% I killed it. There's a lot of code, so I'll pick out and share the bits where memory is being allocated.

EDIT 2: My code:
void FileAnalysis::doWork(std::string opath, std::string oName)
{
    //Sets the output filepath and the name of the file to contain the results
    outpath = opath;
    outname = oName;
    //Reads the data source and writes it to a text file before pushing the filenames into a vector
    setInput();
    //Goes through the files queue and analyzes each file
    while (!files.empty())
    {
        //Puts all of the data points from the next file onto the points vector, then deletes the file from the files queue
        readNext();
        //Places all of the min or max points into their respective vectors
        analyze();
        //Calculates the averages and the offset and pushes those into their respective vectors
        calcAvg();
    }
    makeGraph();
}
//Creates the queue of files to be read
void FileAnalysis::setInput()
{
    string sysCall = "", filepath = "", temp;
    filepath = outpath + "filenames.txt";
    sysCall = "ls " + dataFolder + " > " + filepath;
    system(sysCall.c_str());
    ifstream allfiles(filepath.c_str());
    while (!allfiles.eof())
    {
        getline(allfiles, temp);
        files.push(temp);
    }
}
//Places the data from the next file into the points vector, then removes that filename from the files queue
void FileAnalysis::readNext()
{
    cout << "Reading from " << dataFolder << files.front() << endl;
    ifstream curfile((dataFolder+files.front()).c_str());
    string temp, temptodouble;
    double tempval;
    getline(curfile, temp);
    while (!curfile.eof())
    {
        if (temp.size() > 0)
        {
            unsigned long pos = temp.find_first_of("\t");
            temptodouble = temp.substr(pos, pos);
            tempval = atof(temptodouble.c_str());
            points.push_back(tempval);
        }
        getline(curfile, temp);
    }
    setTime();
    files.pop();
}
//Sets the maxpoints and minpoints vectors from the points vector and adds those vectors to the allmax and allmin vectors
void FileAnalysis::analyze()
{
    for (unsigned int i = 1; i < points.size()-1; i++)
    {
        if (points[i] > points[i-1] && points[i] > points[i+1])
        {
            maxpoints.push_back(points[i]);
        }
        if (points[i] < points[i-1] && points[i] < points[i+1])
        {
            minpoints.push_back(points[i]);
        }
    }
    allmax.push_back(maxpoints);
    allmin.push_back(minpoints);
}
//Calculates the average max and min points from the maxpoints and minpoints vectors, adds those averages to the avgmax and avgmin vectors, and adds the offset to the offset vector
void FileAnalysis::calcAvg()
{
    double maxtotal = 0, mintotal = 0;
    for (unsigned int i = 0; i < maxpoints.size(); i++)
    {
        maxtotal += maxpoints[i];
    }
    for (unsigned int i = 0; i < minpoints.size(); i++)
    {
        mintotal += minpoints[i];
    }
    avgmax.push_back(maxtotal/maxpoints.size());
    avgmin.push_back(mintotal/minpoints.size());
    offset.push_back((maxtotal+mintotal)/2);
}
EDIT 3: I added code to reserve vector space and to close the files, but my memory still gets filled to 96% before the program stops...
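Roughly what those additions look like inside readNext() (reconstructed from memory, so treat the exact capacity and placement as approximate):

void FileAnalysis::readNext()
{
    points.reserve(1000);                    //reserve up front for the ~1,000 doubles per file
    ifstream curfile((dataFolder+files.front()).c_str());
    //... same parsing loop as above ...
    curfile.close();                         //explicitly close the input file when done
    setTime();
    files.pop();
}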
Can you run top while you are doing the work to see how much memory you are using? Could you use a pre-allocated array of doubles instead of a higher-overhead structure like std::vector? One potential problem is that it could be fragmenting memory trying to get a large contiguous amount of memory.
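To make that pre-allocated-buffer suggestion concrete, here is a minimal self-contained sketch (MAX_POINTS, readIntoBuffer and "example.txt" are illustrative names/values, and it assumes the number you want sits after the tab; it is not the code from the question):

#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>

//One fixed-size buffer, allocated once and reused for every file, so the heap never grows.
//MAX_POINTS is an assumption based on the ~1,000 doubles per file mentioned in the question.
const std::size_t MAX_POINTS = 1000;

std::size_t readIntoBuffer(const std::string& path, double* buf)
{
    std::ifstream in(path.c_str());
    std::string line;
    std::size_t n = 0;
    while (n < MAX_POINTS && std::getline(in, line))
    {
        std::size_t pos = line.find_first_of("\t");
        if (pos != std::string::npos)
        {
            buf[n++] = std::atof(line.substr(pos + 1).c_str());  //take the value after the tab
        }
    }
    return n;  //number of doubles actually stored
}

int main()
{
    static double points[MAX_POINTS];  //pre-allocated once; would be reused across all 360 files
    std::size_t count = readIntoBuffer("example.txt", points);  //"example.txt" is a placeholder
    return count > 0 ? 0 : 1;
}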