deleted 44 characters in body; edited tags; edited title
Jamal

Reading and parsing CSV files

This is my first real attempt at a Scala program. I come from a predominantly Java background, so I'd like to know if the program sticks to Scala conventions well.

Is it readable, or should it be formulated differently? To me, there are a lot of lines in the main function, which doesn't quite sit right.

If interested, the unit tests are here:

package com.wesleyacheson

import org.junit._
import Assert._
import scala.io.Source

class CsvReader2Test {

   val simpleSource = Source.fromString("""abc,def,ghi
 jkl, zzz""");
   val quotedFields = Source.fromString("foo bar, \"foo bar\"");

   @Test def VerifyReaderPassedMustNotBeNull() {
     val source: Source = null;

     try {
       new CSVReader(source)
       fail("Should have thrown exception")
     } catch {
       case e: IllegalArgumentException => //Expected
     }
   }

   @Test def VerifyReadAllReturnsAStringList() {
     assertTrue("Expected a List[String]", new CSVReader2(simpleSource).readAll().isInstanceOf[List[String]]);
   }

   @Test def VerifyNumberOfReadLinesAreCorrect() {
     assertEquals(2, new CSVReader2(simpleSource).readAll().size)
   }

   @Test def VerifyFirstLineContainsExpectedValues() {
     val firstLine = new CSVReader2(simpleSource).readAll().head
     println(firstLine)
     assertEquals("Checking the number of tokens in the first line", 3, firstLine.size)
     assertTrue("Checking that the line contains " + "abc", firstLine.contains("abc"))
     assertTrue("Checking that the line contains " + "def", firstLine.contains("def"))
     assertTrue("Checking that the line contains " + "ghi", firstLine.contains("ghi"))
     assertFalse("Checking that the line does not contain" + "jkl", firstLine.contains("jkl"))
   }

   @Test def verifyBothQuotedFieldsAreTheSame() {
     val line = new CSVReader2(quotedFields).readAll().head;
     assertEquals("Checking that the number of tokens in the first line", 2, line.size);
     assertEquals("foo bar", line(0));
     assertEquals("foo bar", line(1));
   }

   @Test def verifySimpleQuotedValueIsUnchanged {
     println(new CSVReader2(Source.fromString("\"foo bar\",")).readLine()(0))
     assertEquals("foo bar", new CSVReader2(Source.fromString("\"foo bar\",")).readLine()(0))
   }

   @Test def verifyDoubleQuotesAreConvertedToQuote {
     assertEquals("\"foo bar\"", new CSVReader2(Source.fromString("\"\"foo bar\"\",")).readLine().head)
   }

   @Test def verify3QuotesAreTreatedAsDoubleQuotesWithinSection {
     assertEquals("\"foo bar\"", new CSVReader2(Source.fromString("\"\"\"foo bar\"\"\",")).readLine().head)

   }

   @Test def verifyQuotedCommasAreReturned {
     assertEquals(",", new CSVReader2(Source.fromString("\",\",")).readLine().head)
   }

   @Test def verifyLeadingWhiteSpaceIsRemoved {
       assertEquals("abc  def", new CSVReader2(Source.fromString("   abc  def")).readLine().head)
   }

  @Test def verifyTailingWhiteSpaceIsRemoved {
    assertEquals("1", new CSVReader2(Source.fromString("1    ")).readLine().head)
  }

  @Test def verifyQuotedLeadingWhitespaceIsPreserved {
    assertEquals("   1", new CSVReader2(Source.fromString("\"   1\"")).readLine().head)
  }

  @Test def verifyQuotedTailingWhitespaceIsPreserved {
    assertEquals("1   ", new CSVReader2(Source.fromString("\"1   \"")).readLine().head)
  }

  @Test def verifyQuotedBlankLinesArePreserved{
    assertEquals("first\n\rsecond", new CSVReader2(Source.fromString("\"first\n\rsecond\"")).readLine().head)
  }

  @Test def verifyCellsAreInTheRightOrder{
    val returned = new CSVReader2(Source.fromString("first,second")).readLine()
    assertEquals("first", returned(0));
    assertEquals("second", returned(1));
  }

  @Test def verifyRowsAreInTheRightOrder{
    val returned = new CSVReader2(Source.fromString("first\nsecond")).readAll()
    assertEquals("first", returned(0)(0));
    assertEquals("second", returned(1)(0));
  }

  @Test def verifyNewLineCarriageReturnIsOnlyTreatedAsOneBlankLine{
    val returned = new CSVReader2(Source.fromString("first\n\rsecond")).readAll()
    assertEquals("first", returned(0)(0));
    assertEquals("second", returned(1)(0));
  }

  @Test def verifyReadsUntilFirstQuote {
    assertEquals("abc\"defg", new CSVReader2(Source.fromString("")).readUntilQuote(Source.fromString("abc\"\"defg\"hi\"jklmnop").buffered ))
  }

  @Test def verifyThrowsExceptionIfNoQuoteFound {
    try {
    new CSVReader2(Source.fromString("")).readUntilQuote(Source.fromString("a").buffered)
      fail()
    } catch {
      case e => //expected
    }
  }
 }

Here is what I see when I look at this from a Java viewpoint: string concatenation isn't usually done with +. However, I don't see how to use a string buffer and keep it semi-functional.
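
One possible way to reconcile the two, as a minimal sketch: keep the StringBuilder local to a single method so the mutation never leaks out to callers. The names TokenText, build and buildWithMkString are made up for illustration and are not part of the reader under review.

```scala
object TokenText {
  // Builds one token from its characters. The StringBuilder is mutable,
  // but it never escapes this method, so callers still see a pure function.
  def build(chars: Iterator[Char]): String = {
    val sb = new StringBuilder
    chars.foreach(c => sb.append(c))
    sb.toString
  }

  // Same result with no visible mutation at all; mkString is part of the
  // standard Iterator API.
  def buildWithMkString(chars: Iterator[Char]): String = chars.mkString
}
```

Either form avoids repeated + on strings inside a loop while keeping the method's signature free of mutable state.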

val finished = {() => (partialToken::tokens).reverse}

I've been told that a lazy val may be more appropriate for this, and I tend to agree.
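
A rough sketch of how the lazy val variant might look. The method finish, its atEndOfLine flag and the builds counter are hypothetical and exist only to make the deferred evaluation visible; partialToken and tokens are assumed to be a String and a reversed List[String].

```scala
object LazyFinished {
  // Counts how often the list is actually built; illustration only.
  var builds = 0

  def finish(partialToken: String, tokens: List[String], atEndOfLine: Boolean): List[String] = {
    // Unlike `() => ...`, a lazy val is read without parentheses, and its
    // body runs at most once, on first use.
    lazy val finished = {
      builds += 1
      (partialToken :: tokens).reverse
    }
    if (atEndOfLine) finished else Nil
  }
}
```

When atEndOfLine is false, the body of finished is never evaluated, which is the same deferral the anonymous function gave.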

It may be better for extensibility if I returned a list of token objects (or is this my Java head interfering?).
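
To make the token-object idea concrete, here is a small sketch under assumptions. Token, Quoted, Unquoted and render are invented names; the quoting rule in render (double any embedded quote) simply mirrors what the unit tests above expect when reading.

```scala
object Tokens {
  // A closed set of token kinds, so pattern matches can be checked for exhaustiveness.
  sealed trait Token { def text: String }
  final case class Quoted(text: String) extends Token
  final case class Unquoted(text: String) extends Token

  // Writes a token back out as a CSV cell.
  def render(token: Token): String = token match {
    case Quoted(t)   => "\"" + t.replace("\"", "\"\"") + "\""
    case Unquoted(t) => t
  }
}
```

Case classes keep this lighter than the equivalent Java class hierarchy, though whether the extra type is worth it over plain String is a judgment call.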

I wonder if any traits could be mixed in to make it richer, for instance a trait dealing with two-dimensional tabular data, if one exists.
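
I am not aware of a standard trait for this, so the following is only a sketch of what a hand-written one might look like. Table, ParsedCsv, cell and column are hypothetical names; rows is assumed to be what readAll() returns.

```scala
object Tabular {
  // Anything that can expose its rows gets cell and column access for free.
  trait Table {
    def rows: List[List[String]]

    // None when the row or column index is out of range.
    def cell(row: Int, col: Int): Option[String] =
      rows.lift(row).flatMap(_.lift(col))

    // Rows that are too short simply contribute nothing to the column.
    def column(col: Int): List[String] =
      rows.flatMap(_.lift(col))
  }

  final class ParsedCsv(val rows: List[List[String]]) extends Table
}
```

A reader could mix in such a trait by defining rows in terms of its own readAll().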

Tweeted twitter.com/#!/StackCodeReview/status/363944985558646785
Changed title to more succinctly reflect what I want from a review (and to bump).
Athas

First Scala Program - CSV Parsing - is it idiomatic?

EDIT: I've been told that a lazy val may be more appropriate for this, and I tend to agree.

Added my own point of view. Fixed Spelling.
Athas

This is my first real attempt at a Scala program. I come from a predominantly Java background, so I'd like to know if the program sticks to Scala conventions well.

Is it readable, or should it be formulated differently? To me, there are a lot of lines in the main function, which doesn't quite sit right.

It is functioning and I can provide my JUnit tests to prove it. Not all tests pass, as those dealing with whitespace are now different.

If interested, the unit tests are here. They're not for review.

Edit

Here is what I see when I look at this from a Java viewpoint: string concatenation isn't usually done with +. However, I don't see how to use a string buffer and keep it semi-functional.

The program isn't a functional program. I don't think this could have been avoided while using a Source. Pure functional programming would have meant making concessions, like converting the entire input to a string outside and passing that in, which isn't practical.

I don't like the argument-less anonymous function, but I don't know what else to do for it.

val finished = {() => (partialToken::tokens).reverse}

It may be better for extensibility if I returned a list of token objects (or is this my Java head interfering?).

I wonder if any traits could be mixed in to make it richer, for instance a trait dealing with two-dimensional tabular data, if one exists.

The buffered iterator was a bit of a cheat. My original was far longer until I added that. Feels like maybe I've skipped a bit of learning for the sake of convenience.
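
For readers unfamiliar with it, this is roughly what the buffered iterator buys: head peeks at the next character without consuming it. The sketch below is not the reader's actual code; PeekDemo, readField and demo are made-up names, and splitting on a bare comma is a simplification that ignores quoting.

```scala
import scala.io.Source

object PeekDemo {
  // Reads characters up to, but not including, the next comma.
  // `head` lets the loop look ahead, so the comma is left unread for the caller.
  def readField(in: BufferedIterator[Char]): String = {
    val sb = new StringBuilder
    while (in.hasNext && in.head != ',') sb.append(in.next())
    sb.toString
  }

  // Returns the first field and the character left waiting in the iterator.
  def demo(): (String, Char) = {
    val in = Source.fromString("ab,cd").buffered
    val field = readField(in)
    (field, in.head)
  }
}
```

Without buffered, the lookahead character would have to be carried around by hand, which is presumably where the extra length in the original came from.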

Athas