  1. #1
    Join Date
    Dec 2011
    Posts
    7

    Feed data from CSV to variables as pair of numbers

    Hello all,

    I have a project for school that I need some help with. I am calculating a regression on a set of numbers from a CSV file. The CSV file contains pairs of numbers, where each line represents one pair of X and Y:

    22.33,22.42
    11.23,24.23
    ...

    I need to write code that asks the user to select their CSV file from the file system and parses the values into two arrays of doubles.

    I have the code for the regression side; I just don't know how to load the file and parse it the way I need.

    Code:
     
    
    public class LinearRegression { 
    
        public static void main(String[] args) { 
            int MAXN = 1000;
            int n = 0;
            double[] x = new double[MAXN];
            double[] y = new double[MAXN];
    
            // first pass: read in data from the console, compute xbar and ybar
            double sumx = 0.0, sumy = 0.0, sumx2 = 0.0;
            while(!StdIn.isEmpty()) {
                x[n] = StdIn.readDouble();
                y[n] = StdIn.readDouble();
    			
                sumx  += x[n];
                sumx2 += x[n] * x[n];
                sumy  += y[n];
                n++;
            }
            double xbar = sumx / n;
            double ybar = sumy / n;
    
            // second pass: compute summary statistics
            double xxbar = 0.0, yybar = 0.0, xybar = 0.0;
            for (int i = 0; i < n; i++) {
                xxbar += (x[i] - xbar) * (x[i] - xbar);
                yybar += (y[i] - ybar) * (y[i] - ybar);
                xybar += (x[i] - xbar) * (y[i] - ybar);
            }
            double beta1 = xybar / xxbar;
            double beta0 = ybar - beta1 * xbar;
    
            // print results
            System.out.println("y   = " + beta1 + " * x + " + beta0);
    
            // analyze results
            int df = n - 2;
            double rss = 0.0;      // residual sum of squares
            double ssr = 0.0;      // regression sum of squares
            for (int i = 0; i < n; i++) {
                double fit = beta1*x[i] + beta0;
                rss += (fit - y[i]) * (fit - y[i]);
                ssr += (fit - ybar) * (fit - ybar);
            }
            double R2    = ssr / yybar;
            double svar  = rss / df;
            double svar1 = svar / xxbar;
            double svar0 = svar/n + xbar*xbar*svar1;
            System.out.println("R^2                 = " + R2);
            System.out.println("std error of beta_1 = " + Math.sqrt(svar1));
            System.out.println("std error of beta_0 = " + Math.sqrt(svar0));
            svar0 = svar * sumx2 / (n * xxbar);
            System.out.println("std error of beta_0 = " + Math.sqrt(svar0));
    
            System.out.println("SSTO = " + yybar);
            System.out.println("SSE  = " + rss);
            System.out.println("SSR  = " + ssr);
        }
    }
    The code above does what I need, but it reads the numbers from the console. I want to be able to load a file and run the same calculation on its contents.

    Thank you for helping!

  2. #2
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: Feed data from CSV to variables as pair of numbers

    Google for "Java read csv file"
    Posting code? Use code tags like this: [code]...Your code here...[/code]

  3. #3
    Join Date
    Dec 2011
    Posts
    7

    Re: Feed data from CSV to variables as pair of numbers

    Thanks for your reply. I started doing this:

    Code:
    import java.io.*;
     
    public class InputTest {
        public static void main(String[] args) {
            double[] xlines = new double[0]; 		//array to be used for the X values
            double[] ylines = new double[0]; 		//array to be used for the Y values
            String path = "c:\\dev\\cis.csv";	    //path
            double value = 0;
            double sumx = 0.0, sumy = 0.0, sumx2 = 0.0; // used later to calculate Beta0, Beta1 and the equation
            BufferedReader br = null;
            
            try {
                File file = new File(path);         //file as the file itself
                br = new BufferedReader(
                     new InputStreamReader(
                     new FileInputStream(file)));
                String line; 						//creates a variable line (string)
                while( (line = br.readLine()) != null ) {
                    String[] values = line.split(",");  // this separates the values
                    for (String str : values) {         // for each value do the following.
                        value = convert(str, value);
                        xlines = add(value, xlines);
                    }
                }
                br.close();
            } catch(IOException e) {
                System.out.println("read error: " + e.getMessage());
            }
            print(xlines);
        }
     
        private static double[] add(double s, double[] lines) {
            int len = lines.length;
            double[] temp = new double[len+1];
            System.arraycopy(lines, 0, temp, 0, len);
            temp[len] = s;
            return temp;
        }
     
        private static void print(double[] lines) {
            for(int i = 0; i < lines.length; i++)
                System.out.println(lines[i]);
        }
    
        public static double convert(String s, double variable) {
            String aString = s;
            double aDouble = Double.parseDouble(aString);
            return aDouble;
        }
    }
    The issue I am facing now is how to write a loop that adds the first value on each line to one array (xlines) and the second value to the other array (ylines).

    Thanks!

  4. #4
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: Feed data from CSV to variables as pair of numbers

    Why are you using arrays, why not use ArrayLists which automatically grow as you add more elements?
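
    As a rough illustration of the ArrayList suggestion, here is a minimal sketch. The class name PairLists and the helper parseLines are made up for this example, and it assumes every non-blank line holds exactly two comma-separated numbers. The lists grow on their own, so no add() helper or array copying is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PairLists {
    // Hypothetical helper (not from the thread): splits each "x,y" line and
    // appends the first number to xs and the second to ys.
    public static void parseLines(List<String> lines, List<Double> xs, List<Double> ys) {
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;                       // skip blank lines
            }
            String[] values = line.split(",");  // assumes exactly two fields per line
            xs.add(Double.parseDouble(values[0].trim()));
            ys.add(Double.parseDouble(values[1].trim()));
        }
    }

    public static void main(String[] args) {
        List<Double> xs = new ArrayList<>();
        List<Double> ys = new ArrayList<>();
        // Sample rows taken from the first post.
        parseLines(Arrays.asList("22.33,22.42", "11.23,24.23"), xs, ys);
        System.out.println("x values: " + xs);
        System.out.println("y values: " + ys);
    }
}
```

    If the regression code needs plain arrays, the lists can be copied into a double[] of size xs.size() once reading is finished.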

    Why does your convert() method take 2 parameters but only use 1 of them?

    The issue I am facing now is how to write a loop that adds the first value on each line to one array (xlines) and the second value to the other array (ylines).
    Well you have a few choices:
    1. As there are only 2 values per line, you could just take the brute-force approach and, in place of the for loop, duplicate the two lines, i.e.
    Code:
        value = convert(values[0], value);
        xlines = add(value, xlines);
        value = convert(values[1], value);
        ylines = add(value, ylines);
    2. Or you could use a boolean flag to switch between the arrays, i.e.
    Code:
    boolean first = true;
    for (String str : values) {            // for each value do the following.
        value = convert(str, value);
        if ( first )
            xlines = add(value, xlines);
        else
            ylines = add(value, ylines);
        first = ! first;
    }
    3. Or, rather than 2 separate arrays, use a 2-dimensional one and index it in parallel with the values array, i.e.
    Code:
    double[][] lines = new double[2][0];
    for ( int i = 0; i < values.length; i++ ) {
        value = convert(values[i], value);
        lines[i] = add(value, lines[i]);
    }
    Option 1 has the advantage of being easy to write and read. Option 2 uses the most lines of code for no real advantage. Option 3 is harder to understand and may require changes to the rest of your code, but if you later want to read 3 or more values per line, it's just a matter of changing the size of the first dimension of the lines array.
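
    To show how option 1 could fit into the whole read loop, here is a minimal sketch, not a definitive implementation. The class name CsvPairs and the method read are made up for this example. It assumes two comma-separated numbers per non-blank line and takes the file path from the command line instead of hard-coding it. The arrays are doubled when full rather than grown by one element per value.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class CsvPairs {
    // Hypothetical helper (not from the thread): reads "x,y" lines from a file
    // and returns { xValues, yValues }.
    public static double[][] read(Path path) throws IOException {
        double[] x = new double[16];
        double[] y = new double[16];
        int n = 0;
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader br = Files.newBufferedReader(path)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;                        // skip blank lines
                }
                String[] values = line.split(",");   // assumes exactly two fields
                if (n == x.length) {                 // arrays full: double their size
                    x = Arrays.copyOf(x, n * 2);
                    y = Arrays.copyOf(y, n * 2);
                }
                x[n] = Double.parseDouble(values[0].trim());  // first value -> x
                y[n] = Double.parseDouble(values[1].trim());  // second value -> y
                n++;
            }
        }
        // trim both arrays to the number of pairs actually read
        return new double[][] { Arrays.copyOf(x, n), Arrays.copyOf(y, n) };
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java CsvPairs <path-to-csv>");
            return;
        }
        double[][] data = read(Paths.get(args[0]));
        System.out.println(data[0].length + " pairs read");
    }
}
```

    The two returned arrays can then take the place of the x and y arrays that the regression code fills from the console, with n set to their length.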

  5. #5
    Join Date
    Dec 2011
    Posts
    7

    Re: Feed data from CSV to variables as pair of numbers

    Why does your convert() method take 2 parameters but only use 1 of them?
    Silly mistake, fixed this.


    I'll try the rest now and get back to you.

    Thank you so much for clarifying.

  6. #6
    Join Date
    Dec 2011
    Posts
    7

    Re: Feed data from CSV to variables as pair of numbers

    Option 2 worked fine for me.

    Thanks a lot!
