logo

Home page
Articles for Windows, Linux, OS X
Mac tips and articles
Mac tips
Windows 8 tips and articles
Windows 7 tips and articles
Vista Tips
XP Tips
Linux tips and articles
Read the blog
Online store
Windows, Linux, OS X programs
Links
About

Read and write .csv files in RealBasic

The .csv file format is a universally understood one that has been in use for so long that it is hard to remember who actually created it and when. When data is saved in .csv format it is saved as plain text using commas and carriage returns to separate each item.

It is so simple that almost all programs that are able to export and import information can do so in .csv format. This makes it perfect for exchanging information between different programs.

Even if you are not trying to exchange data and just want to store information on the hard disk drive, it is often useful to save it in .csv format. Small and simple databases that store everything in memory can store the database on disk as a .csv file and there is no need for complex formats.

The data is stored in a very simple and logical way, and it is quite compact too, with little wasted space. If you lose the original program you will always be able to access the original data because it can be loaded into a spreadsheet or a text editor and viewed or edited.

Let's take a look at a simple example of a .csv file:

Fruit,People,Places,Planets,Numbers
Apple,John Smith,London,Mercury,12345
Banana,Bob Jones,New York,Venus,987654
Orange,Susan Brown,Paris,Saturn,-1
Lemon,Mary Moore,Moscow,Jupiter,3.142

There are five lines of data and five items on each line. Of course, it doesn't have to be 5 x 5 and I could have typed in more - or less. You can store as many items on each line as you want and the number of lines is equal to the amount of data you want to store. You could have 1,000 or even 10,000 lines if you wanted to and there aren't any limits.

The example .csv file above is a simple database and there are four records. Each record has five fields and the name of each field is stored in the first line. This is optional and some .csv files have the field names in the first line and some don't. It's up to you whether you want to store the field names or not.

Writing .csv files
RealBasic csv exampleA RealBasic could store information in an array or it could store it in a ListBox control. This is a grid of rows and columns that is like the cells in a spreadsheet or the table view of a database.

You will often want to save the information stored in a ListBox to disk and then load it again the next time the program is run, so this is what is used in the example program.

If you want to save the contents of an array and load it back again, it is quite easy to modify the program code

The program can be downloaded from here and loaded into Realbasic so you can compile it and run it yourself.

The only code that is important is the code on the Load and Save buttons. The rest is trivial and you can examine the program yourself.

So let's look at the code on the Save button first. This writes the data in the ListBox grid to the hard disk drive.

RealBasic .csv write

First we show the standard file selector and get the name of the file to save to. Of course, you could set the filename in code, but in this example the user is asked where to save the file and what to call it:

f = GetSaveFolderItem("text/plain","Test.csv")

Now the file is created and it's a plain text file and a text output stream is used:

tos = f.CreateTextFile

You'll see some error checking in the code, for example, you need to test whether the user clicked the Cancel button when saving the file or whether the file could not be created for some reason. The next bit of code, the bit that writes the .csv data, is wrapped in an if...then in order to check that there is actually some data in the grid to save!

if ListBox1.ListCount>0 then            'if grid not empty
  for i=0 to ListBox1.ListCount-1       'for each row
    s=""                                'build line to save
    for j=0 to ListBox1.ColumnCount-1   'for each column
      s=s+ListBox1.Cell(i,j)+ ","       'csv
    next
    tos.WriteLine s.left(s.len-1)       'save line
  next
end if

It's really simple and all that you need to do is to build the line that is to be saved in a string. The string variable s is set to "" (nothing) and then each item of data in the current tow is added - with a comma in between each one. The string is written to disk when all the data for the row has been added. That's it. The data is stored in a .csv file.

Reading .csv files
How do we read the data back into the ListBox grid? Let's look at the code on the Load button in the example program. It's fairly straightforward and quite short and simple:

RealBasic .csv write

The first task is to show a standard file selector and ask the user to select the file that the data was saved to. Of course, this step is optional and you could write the file name and path in the code to allow the program to load and save data without the user being asked:

f = GetOpenFolderItem("any")

Eagle eyed readers will notice that the file was saved as a 'text/plain' file type, but now the 'any' type is used. This is because other applications can take over control of a file and change it's type. When this happens you may find it cannot be opened as a text/plain file. For exmple, opening the .csv file in TextEdit on my Apple Mac, adding or editing data and then saving it again turns it into a NeoOffice document! Double click the .csv file and the spreadsheet imports it! Weird!

The file is opened as a text file:

tis = f.OpenAsTextFile

Then we can read the contents of the file. One line is read at a time and the RealBasic Split(s,",") function is used to place each item or field into the fields() array. This chops the string at each comma and places each chunk into a separate element of the fieds() array. It's really useful.

while not tis.EOF                    'while not end-of-file
  ListBox1.AddRow ""                 'add row to grid
  s=tis.ReadLine                     'read line from file
  fields=Split(s,",")                'put items in fileds() array
  for i=0 to ListBox1.ColumnCount-1  'copy to grid
    ListBox1.Cell(ListBox1.ListCount-1,i)=Trim(fields(i))
  next
wend

Quoted strings
What we have looked at so far is the simplest case of a .csv file and it is always best to keep things as simple as possible. An obvious problem would occur if the data being saved contained commas. For example, in an address database, the address might be stored as 123, Fleet Street, London. There are two commas in there and when the data was read back the Split() command would split on each comma as if they were separate fields rather than one address field. This would cause the wrong data to be written to the wrong cells in the ListBox grid.

The way that .csv files overcome this problem is by enclosing strings like this in quotes, so the address would be stored as "123, Fleet Street, London". When it is read back in from disk, the quotes should be removed and the whole string treated as one field.

There are two ways of overcoming the comma-in-fields problem and you could write your own function to replace RealBasic's Split(). It would need to chop the line read from disk into chunks and store them in the fields() array. I'm not going to do that though, I'm going to cheat! Originally in the save code we built up the string to save like this:

s=s+ListBox1.Cell(i,j)+","

In order to cope with commas in the data, we can simply replace them with some other character - something that isn't normally used, like this:

s=s+ReplaceAll(ListBox1.Cell(i,j),",","|")+","

In this case we replace every comma with a vertical bar character - |. You could use any character of course, and I use | because it's not part of the alphabet or punctuation, so people don't normally type it in. When the file is opened for reading all we have to do is to replace the | with commas again. So in the code the read the data back in we replace this line:

ListBox1.Cell(ListBox1.ListCount-1,i)=Trim(fields(i))

with this one:

ListBox1.Cell(ListBox1.ListCount-1,i)=Trim(ReplaceAll(fields(i),"|",","))

This would make our .csv file non-standard, but if you are only reading and writing your own data files, this doesn't matter. Whether you cheat and replace commas on writing and reading or you create your own Split() function is up to you.


copyright