Javanomicon01 - Case Study 10 - Electronic Journal

From Monkeys @ Keyboards

Case Study 10 - Electronic Journal


Introduction

In this case study we're going to pull together a large amount of what we've learned during the course of this book into a fairly complex, stand-alone application. We're also going to look at some of the performance issues concerning the application and see which of the optimisation tips we discussed in chapter 14 (Efficiency and Optimisation) can usefully be applied.

The application we're going to develop is an electronic journal - a diary of sorts that allows users to store their comments on a day to day basis and save them to an external data file using random access based file IO.

We'll allow the user to retrieve files by date, and also allow them to search through all their entries for specific keywords.

The techniques we need to utilise to build this application are quite varied, and it should prove to be a good way for us to stretch our programming muscles a little for our final Java-based case study.

The Interface

As always, we begin by designing our interface. Most of our peripheral functionality is going to be accessed via a menu, but we'll need a button that lets us submit a day's entry. We'll also need a text area that allows the user to enter their desired information, and a label that displays the current date. The date handling for this particular application will give us cause to investigate a fairly esoteric class (GregorianCalendar), but for starters we will just make use of the standard Date that we have seen previously.

The interface for our application looks like this:

33.1: Application Storyboard

And the code for our interface:

<source lang = "java">

import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import java.util.*;

public class JournalInterface extends JFrame implements ActionListener {

   JButton submit;
   JTextArea entryArea;
   JLabel currentDate;
   JMenuBar applicationMenu;
   public JournalInterface() {
       Date time = new Date();
       submit = new JButton("Submit Entry");
       submit.addActionListener(this);
       add(submit, BorderLayout.SOUTH);
       entryArea = new JTextArea(10, 10);
       add(entryArea, BorderLayout.CENTER);
       currentDate = new JLabel("The current date is " + time);
       add(currentDate, BorderLayout.NORTH);
       applicationMenu = new JMenuBar();
       setJMenuBar(applicationMenu);
   }
   public static void main(String args[]) {
       JournalInterface main = new JournalInterface();
       main.setSize(400, 400);
       main.setTitle("Electronic Journal");
       main.setVisible(true);
   }
   public void actionPerformed(ActionEvent e) {
   }

}

</source>

It's all the usual suspects that have been lined up for our interface - the only new bit is using a Date object to display the current date - we'll talk about the implications of this a little bit further into the chapter.

We haven't added anything to our menu yet - we'll be adding each individual entry as our requirements dictate.

Step Two: Data Representation

Here we come to the thorny part of this particular application - how are we going to store the data? We are going to have strings of text associated with dates, so a HashMap seems like an appropriate way of setting up that one-to-one relationship. However, it's perfectly possible that someone will make more than one entry on a particular day, so we need some way of making sure that one entry doesn't overwrite another.

We can deal with this in a few ways - there's no right or wrong approach, it's all a matter of individual preference:

  • Store journal entries in an ArrayList, meaning that each date is associated with an ArrayList of relevant entries.
  • Automatically append existing entries with the text 'update' followed by the text of the new entry.
  • Automatically load up the day's existing entry when the user opens the application, putting the emphasis for revisions onto them.

The last option sounds like a good idea - that way the user isn't forced into some artificial standard that they don't like, and the interface doesn't become unnecessarily complicated by forcing the user to switch between multiple entries for a single day.

So, we have to ask ourselves if a HashMap is indeed the best structure for this kind of application. Well, let's have a think about that.

HashMaps are ideal for quick access to information - we don't need to search through every element in an array to find the entry we are interested in. But are we going to be storing enough elements for that to be a particular issue? After all, we'll be storing at most 366 entries for a year, so even after ten years, with an entry being made every single day, we'll have fewer than 3,660 entries, and performing a search for particular dates is going to be an irregular operation.

HashMaps are weaker in the support they provide for developers - it is a little more awkward to iterate over every element in a HashMap object, and because of the way the data is represented internally the entries do not come back in any guaranteed order. We're going to be iterating in exactly this way when we search for keywords within all existing entries.

Perhaps then a HashMap isn't the perfect solution. You may disagree, and that's fine - an alternate implementation would indeed work perfectly well using a HashMap, and you could investigate this kind of data storage as an extra exercise.
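For anyone who wants to try that extra exercise, here is a rough sketch of what a HashMap-based store might look like. Everything in it is an assumption made for illustration: the class name HashMapJournalSketch, the keyFor helper, and the choice of a "year-month-day" string as the key are not part of the case study.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Map;

public class HashMapJournalSketch {

    // Hypothetical storage: one piece of text per "year-month-day" key.
    private final Map<String, String> entries = new HashMap<String, String>();

    // Build a key that ignores the time of day.
    public static String keyFor(Calendar c) {
        // Calendar.MONTH counts from zero, so add one for a readable key.
        return c.get(Calendar.YEAR) + "-"
                + (c.get(Calendar.MONTH) + 1) + "-"
                + c.get(Calendar.DAY_OF_MONTH);
    }

    // A second entry made on the same day replaces the first.
    public void addEntry(Calendar c, String text) {
        entries.put(keyFor(c), text);
    }

    // Returns null when nothing is stored for that day.
    public String getEntry(Calendar c) {
        return entries.get(keyFor(c));
    }

    // Searching means walking every key/value pair, in no particular order.
    public int countMatches(String term) {
        int matches = 0;
        for (Map.Entry<String, String> e : entries.entrySet()) {
            if (e.getValue().indexOf(term) != -1) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        HashMapJournalSketch journal = new HashMapJournalSketch();
        journal.addEntry(new GregorianCalendar(2020, Calendar.JANUARY, 5), "Went sledging");
        journal.addEntry(new GregorianCalendar(2020, Calendar.JANUARY, 6), "Stayed in");
        System.out.println(journal.getEntry(new GregorianCalendar(2020, Calendar.JANUARY, 5)));
        System.out.println(journal.countMatches("sledging"));
    }
}
```

Lookup by date becomes a single call, at the cost of having to design a key and losing any natural ordering of the entries.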

But back to the case study - how about if we create a separate class for journal entries? That way, if we need to we can expand the information we store about each entry:

<source lang = "java">
import java.util.*;

public class JournalEntry {

   private Calendar timeOfEntry;
   private String journalEntry;
   public JournalEntry(Calendar c, String j) {
       timeOfEntry = c;
       journalEntry = j;
   }
   public String getEntry() {
       return journalEntry;
   }
   public void setEntry(String str) {
       journalEntry = str;
   }
   public Calendar getDate() {
       return timeOfEntry;
   }

}
</source>

Note that we provide get and set methods for journalEntry, but only a get method for the date - we don't ever want to change this once an entry has been set, so we don't provide a facility for it.

Note also the use of the unfamiliar Calendar object as an attribute of the class. We'll come to this in a later section of the chapter.

Now that we have a class for each individual entry, we should write a class that represents the journal as a whole. This class will be made up of an ArrayList of instances of our JournalEntry class: one Journal will contain many JournalEntries. In database terminology, this would be a one to many relationship.

Our Journal class holds that ArrayList in an attribute called allEntries, and to begin with it offers just two methods: getEntry, which returns the entry stored at a given index, and addEntry, which adds a new entry to the list.
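As a rough sketch of the Journal class itself at this stage, assuming the list lives in an attribute called allEntries and that the raw add method takes the entry text and its date, the skeleton might look something like the following. The class is named JournalSketch to make it clear that this is an illustration rather than the definitive listing, the size method is a hypothetical helper, and a cut-down JournalEntry is nested inside so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.GregorianCalendar;

public class JournalSketch {

    // Cut-down stand-in for the JournalEntry class shown above.
    public static class JournalEntry {
        private final Calendar timeOfEntry;
        private String journalEntry;

        public JournalEntry(Calendar c, String j) {
            timeOfEntry = c;
            journalEntry = j;
        }
        public String getEntry() { return journalEntry; }
        public void setEntry(String str) { journalEntry = str; }
        public Calendar getDate() { return timeOfEntry; }
    }

    // One Journal holds many JournalEntry objects.
    private final ArrayList<JournalEntry> allEntries = new ArrayList<JournalEntry>();

    // Raw get: the caller has to know the index of the entry.
    public JournalEntry getEntry(int index) {
        return allEntries.get(index);
    }

    // Raw add: always creates a new entry, even if that date is already present.
    public void addEntry(String text, Calendar d) {
        allEntries.add(new JournalEntry(d, text));
    }

    // Hypothetical helper, used here only to show how many entries are held.
    public int size() {
        return allEntries.size();
    }

    public static void main(String[] args) {
        JournalSketch journal = new JournalSketch();
        journal.addEntry("Started the journal.", new GregorianCalendar());
        System.out.println(journal.size());
        System.out.println(journal.getEntry(0).getEntry());
    }
}
```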

This is a very simple framework, and we will need to add elements to this as we go along. For the moment, we'll work incrementally and add code as we need it. Our interface is going to make use of this Journal class rather than implement the JournalEntry ArrayList internally. The rationale for this is similar to that which was discussed in chapter 12 when we broke a simple password applet into several different classes to ensure portability, maintainability and suitable encapsulation.

At the moment we just have the raw get and add methods - but getEntry takes an integer parameter that indicates which entry we are to get - so to use this method we need to know the index of the entry, which isn't much use to the user. We also need a way to find a particular entry by date, so we need to write a findEntry method.

Now, we've seen the Date object before, but it's too precise for our needs - we don't want to compare the date of entries to the nearest millisecond - all we want is a comparison to day, month and year. The Date object does contain get methods for each element of the time, but these are deprecated and unlikely to be supported in future versions of the JDK. We can use them, but it's really not a good idea. If we attempt to make use of these deprecated methods, Java will spit out a warning when we compile. Checking the documentation for the deprecated methods indicates the preferred way of performing this kind of check.

Java provides a Calendar class that works in much the same way as the Date class: we obtain an object that points to the current time, and then use the get method of that object to extract elements of the time. We pass in constant values to indicate which part of the time we want:

33.2: Calendar Constants
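To make the constants a little more concrete, here is a small sketch of comparing two Calendar objects by day, month and year only. The class name CalendarFieldsSketch and the sameDay helper are made up for this illustration. Two details are worth noticing: Calendar.MONTH counts from zero (January is 0), and Calendar.DAY_OF_WEEK only identifies the weekday, so a date comparison wants Calendar.DAY_OF_MONTH.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

public class CalendarFieldsSketch {

    // True when both calendars fall on the same day, month and year,
    // whatever the time of day happens to be.
    public static boolean sameDay(Calendar a, Calendar b) {
        return a.get(Calendar.DAY_OF_MONTH) == b.get(Calendar.DAY_OF_MONTH)
                && a.get(Calendar.MONTH) == b.get(Calendar.MONTH)
                && a.get(Calendar.YEAR) == b.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        Calendar morning = new GregorianCalendar(2020, Calendar.MARCH, 14, 8, 15);
        Calendar evening = new GregorianCalendar(2020, Calendar.MARCH, 14, 22, 40);
        Calendar weekLater = new GregorianCalendar(2020, Calendar.MARCH, 21, 8, 15);

        // Same date, different times of day.
        System.out.println(sameDay(morning, evening));
        // Same weekday one week apart, but not the same date.
        System.out.println(sameDay(morning, weekLater));
        System.out.println(morning.get(Calendar.DAY_OF_WEEK)
                == weekLater.get(Calendar.DAY_OF_WEEK));
    }
}
```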

So rather than use a Date object, we're going to search against a Calendar object:

<source lang = "java">

   // Returns the index of the entry made on the same day as d, or -1 if there is none.
   public int findEntry(Calendar d) {
       Calendar tmpDate;
       int day;
       int month;
       int year;
       for (int i = 0; i < allEntries.size(); i++) {
           tmpDate = allEntries.get(i).getDate();
           // DAY_OF_MONTH rather than DAY_OF_WEEK: we want the date, not the weekday
           day = tmpDate.get(Calendar.DAY_OF_MONTH);
           month = tmpDate.get(Calendar.MONTH);
           year = tmpDate.get(Calendar.YEAR);
           if (day == d.get(Calendar.DAY_OF_MONTH)
                   && month == d.get(Calendar.MONTH)
                   && year == d.get(Calendar.YEAR)) {
               return i;
           }
       }
       return -1;
   }

</source>

If we find a journal entry that matches our search criteria, we return the index at which it was found. Otherwise, we return the value -1. In this way we can check for the existence of an entry, and its location, in one compact method call.

So, with this in mind, we can adapt our addEntry method to ensure that we add an entry only if an entry for that date doesn't currently exist. If there is an entry for that date, we don't need to create a new instance of JournalEntry - we can simply change the text of the existing entry... effectively it becomes an add or update method:

<source lang = "java">

   public void addEntry(String text, Calendar d) {
       int i;
       JournalEntry tmp;
       i = findEntry(d);
       if (i == -1) {
           tmp = new JournalEntry(d, text);
           allEntries.add(tmp);
       } else {
           tmp = getEntry(i);
           tmp.setEntry(text);
       }
   }

</source>

And finally (for now) we need an interface method that allows us to link together getEntry and findEntry into one user-friendly, holistic whole:

<source lang = "java">

   public JournalEntry getEntryForDate(Calendar c) {
       int i = findEntry(c);
       JournalEntry tmp;
       if (i == -1) {
           return null;
       }
       tmp = getEntry(i);
       return tmp;
   }

</source>

Now, let's leave the coding of our Journal class for a moment, and spend some time hooking it into our interface.


Step Three: Tying Them Together

As usual, most of the trigger for the functionality is going to be based on our actionPerformed method within our interface, but some of the functionality is also going to be in the constructor method. We'll look at the functionality connected with the submit button first of all.

When the user presses the button, we want to take a copy of what text they have written, and add an entry to our journal at the current date. Step one is that we need a Journal object, which we'll declare with class-wide scope under the name myJournal.

We also need to create a Calendar object that points to the current date. However, here we hit a problem - the base Calendar object is defined as being abstract - it cannot be instantiated. Curses! Who would have thought that the Java designers would confound our search for knowledge by planting such horrendous bombshells throughout the API? They are tricksy little hobbits, one and all!

We can instantiate a concrete implementation of the Calendar class (this concrete implementation is called GregorianCalendar) and harness the Awesome Cosmic Power of polymorphism to ensure it interacts correctly with the rest of our application:

<source lang = "java">


import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import java.util.*;

public class JournalInterface extends JFrame implements ActionListener {

   ...
   Journal myJournal;
   ...
   
   public JournalInterface() {
       ...
       myJournal = new Journal();
   
       ...
   }
   ...

   public void actionPerformed(ActionEvent e) {

       String str;
       Calendar c;
       if (e.getSource() == submit) {
           str = entryArea.getText();
           if (str.length() == 0) {
               JOptionPane.showMessageDialog(null,
                       "You haven't made an entry!");
               return;
           }
           c = new GregorianCalendar();
           myJournal.addEntry(str, c);
       }
   }

} </source>

So, now we can press the submit button to add entries to our data structure - nifty! Now, we need to implement saving and loading, which we'll also deal with in our Journal class. So, let's hurry on back to that!

Step Four: File I/O

We're going to use a default data file for the journal - it will just load and save to the working directory making use of a file called journal.dat. But we need to make another decision here, one that is very similar to the question of which data structure we would make use of. Here, we need to decide what kind of file I/O infrastructure we are going to make use of.

We have two choices - a random access file or a stream based file. We just need to work out which is more appropriate for this particular application.

  • Random access is faster to access individual elements, but can be inconvenient to manipulate when stepping over all records in a file.
  • Stream based IO is easier to parse and search as a single entity, but larger data files can be very inefficient.

Premature optimisation is a bad idea, but random access files seem to offer benefits not provided by stream based IO - after all, we're only going to be accessing individual elements at a time and only stepping over all of them when we search. Our searching won't necessarily be any slower, it's just that the code will be just a little more complex.

However, in order to make use of random access files, we need to enforce a certain size on each of the entries since we must pad or restrict strings to a certain deterministic size. This could be inconvenient for users with very active lives or lots to say about nothing... but if we set the maximum size as something extremely large (say - 20,000 characters), it won't be too much of a burden. Plus, we've never used random access files in a case study, so let's give it a try.

Step one is to define the format for our file:


int day [4 bytes] : day of entry<br>
int month [4 bytes] : month of entry<br>
int year [4 bytes] : year of entry<br>
String text [20002 bytes] : journal text (a 2-byte length prefix written by writeUTF, plus 20000 single-byte characters)<br>
<br>
Record size = 20014 bytes.<br>
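The arithmetic behind that record size can be checked with a short sketch. The class and method names here (RecordLayoutSketch, offsetOf, encodedSize, spaces) are invented for the illustration, and the record is encoded in memory with a DataOutputStream, which writes ints and UTF strings in the same format as RandomAccessFile. Note that the 20002 figure only holds while every character fits in a single byte; an accented character, for example, takes two bytes in this encoding and would make the record longer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RecordLayoutSketch {

    public static final int TEXT_LENGTH = 20000;                       // characters of journal text
    public static final int RECORD_SIZE = 4 + 4 + 4 + 2 + TEXT_LENGTH; // 20014 bytes

    // Byte offset at which record number 'index' starts in the file.
    public static long offsetOf(int index) {
        return (long) index * RECORD_SIZE;
    }

    // Helper: a string made of n spaces.
    public static String spaces(int n) {
        StringBuilder b = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            b.append(' ');
        }
        return b.toString();
    }

    // Encode one record in memory and report how many bytes it occupies.
    public static int encodedSize(int day, int month, int year, String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(day);
            out.writeInt(month);
            out.writeInt(year);
            out.writeUTF(text); // 2-byte length, then the encoded characters
            out.flush();
            return bytes.size();
        } catch (IOException ex) {
            // Not expected when writing to memory.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodedSize(14, 2, 2020, spaces(TEXT_LENGTH)));
        System.out.println(offsetOf(3));
    }
}
```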

With this file structure in mind, let's implement a method that will create a RandomAccessFile and save the current contents of our ArrayList into separate records. First we need to create our RandomAccessFile:

<source lang = "java"> RandomAccessFile myFile; </source>

Then we need to open it (catching the exception that it can throw):

<source lang = "java">

           try {
               myFile = new RandomAccessFile("journal.dat", "rw");
           } catch (IOException ex) {
               System.out.println("There has been a problem opening the file.");
           }

</source>

Then we get each element in turn from the ArrayList, and write each attribute of the JournalEntry object to the file:

<source lang = "java">

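           for (int i = 0; i < allEntries.size(); i++) {
               JournalEntry tmp = allEntries.get(i);
               tmpDate = tmp.getDate();
               // DAY_OF_MONTH rather than DAY_OF_WEEK: we store the date, not the weekday
               day = tmpDate.get(Calendar.DAY_OF_MONTH);
               month = tmpDate.get(Calendar.MONTH);
               year = tmpDate.get(Calendar.YEAR);
               text = tmp.getEntry();
               try {
                   // record i always lives at the same offset (20014 bytes per record),
                   // so saving twice overwrites rather than duplicates
                   myFile.seek(i * 20014L);
                   myFile.writeInt(day);
                   myFile.writeInt(month);
                   myFile.writeInt(year);
                   myFile.writeUTF(text);
               } catch (IOException ex) {
                   System.out.println("There has been a problem writing to the file.");
               }
           }
           // close the file once, after every record has been written
           try {
               myFile.close();
           } catch (IOException ex) {
               System.out.println("There has been a problem closing the file.");
           }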

</source>

Before we can use this method, we need to change our JournalEntry class a little to ensure that no entry goes over the maximum size. We'll add a new method that does this for us:

<source lang = "java">

   public String paddedString(String str) {
       if (str.length() > 20000) {
           str = str.substring(0, 20000);
       } else {
           while (str.length() < 20000) {
               str = str + " ";
           }
       }
       return str;
   }

</source>

And then we can make use of this new method whenever we set the text for our entry, which is in the constructor method and in the setEntry method for our JournalEntry:

<source lang = "java">

import java.util.*;

public class JournalEntry {

   private Calendar timeOfEntry;
   private String journalEntry;
   public JournalEntry(Calendar c, String j) {
       timeOfEntry = c;
       journalEntry = paddedString(j);
   }
   public String getEntry() {
       return journalEntry;
   }
   public void setEntry(String str) {
       journalEntry = paddedString(str);
   }
   public Calendar getDate() {
       return timeOfEntry;
   }
   public String paddedString(String str) {
       if (str.length() > 20000) {
           str = str.substring(0, 20000);
       } else {
           while (str.length() < 20000) {
               str = str + " ";
           }
       }
       return str;
   }

} </source>

Now we can safely call our saveEntries method whenever we need to save all of our records to a file.

However, we also need a method that will read the data file back into our Journal. This will also be located in our Journal class. The procedure for this is very similar to the procedure for saving the file. We create an instance of the RandomAccessFile class, and then determine the number of records (which is the length of the file divided by the size of each record).

We then iterate the required number of times, using the seek method to find the starting point of the next record at each stage of the loop. We then read each element of information into a temporary variable before creating a new JournalEntry object and adding it to our allEntries ArrayList:

<source lang = "java">

   public void loadEntries() {
       RandomAccessFile myFile = null;
       int recordSize = 20014;
       long numRecords = 0;
       int year, month, day;
       String text;
       Calendar c;
       JournalEntry tmp;
       try {
           myFile = new RandomAccessFile("journal.dat", "r");
           numRecords = myFile.length() / recordSize;
           myFile.seek(0);
       } catch (IOException ex) {
           System.out.println("There has been a problem opening the file.");
           return;
       }
       for (int i = 0; i < numRecords; i++) {
           try {
               myFile.seek(i * recordSize);
               System.out.println("Reading record " + i);
               day = myFile.readInt();
               month = myFile.readInt();
               year = myFile.readInt();
               text = myFile.readUTF();
               c = new GregorianCalendar(year, month, day);
               tmp = new JournalEntry(c, text);
               allEntries.add(tmp);
           } catch (IOException ex) {
               System.out.println("There has been a problem reading from the file.");
           }
       }
       try {
           myFile.close();
       } catch (IOException ex) {
           System.out.println("There has been a problem closing the file.");
       }
   }

</source>
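The save and load steps can be tried out together in a small round trip. What follows is only a sketch under a few assumptions: it uses a tiny 16-character text field instead of 20000 so the demo file stays small, it writes to a temporary file rather than journal.dat, it relies on try-with-resources (Java 7 or later) to close the file, and all of the names (RecordRoundTripSketch, writeRecord, readRecord, recordCount, newTempFile) are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

public class RecordRoundTripSketch {

    static final int TEXT_LENGTH = 16;                  // 20000 in the real journal
    static final int RECORD_SIZE = 12 + 2 + TEXT_LENGTH; // three ints, length prefix, text

    // Truncate or pad with spaces so every record has the same size.
    public static String pad(String s) {
        StringBuilder b = new StringBuilder(s);
        if (b.length() > TEXT_LENGTH) {
            b.setLength(TEXT_LENGTH);
        }
        while (b.length() < TEXT_LENGTH) {
            b.append(' ');
        }
        return b.toString();
    }

    // Write one record at the slot given by index, overwriting whatever was there.
    public static void writeRecord(File f, int index, int day, int month, int year, String text) {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.seek((long) index * RECORD_SIZE);
            raf.writeInt(day);
            raf.writeInt(month);
            raf.writeInt(year);
            raf.writeUTF(pad(text));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Read the record at the given slot and describe it as day/month/year: text.
    public static String readRecord(File f, int index) {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek((long) index * RECORD_SIZE);
            int day = raf.readInt();
            int month = raf.readInt(); // stored zero-based, as Calendar.MONTH is
            int year = raf.readInt();
            String text = raf.readUTF();
            return day + "/" + (month + 1) + "/" + year + ": " + text.trim();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Number of records is the file length divided by the record size.
    public static long recordCount(File f) {
        return f.length() / RECORD_SIZE;
    }

    public static File newTempFile() {
        try {
            File f = File.createTempFile("journal", ".dat");
            f.deleteOnExit();
            return f;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        File f = newTempFile();
        writeRecord(f, 0, 1, 0, 2020, "first");
        writeRecord(f, 1, 2, 0, 2020, "second");
        System.out.println(recordCount(f));
        System.out.println(readRecord(f, 1));
    }
}
```

Because each record sits at index multiplied by the record size, rewriting record 0 leaves record 1 untouched, which is the property the fixed-size format buys us.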

Now, all that's left to do is implement the searching... and since we have everything stored in an ArrayList, this is a fairly simple process. We just step over every element in our ArrayList, and store each in a temporary variable. We use the getEntry method to store the text of an entry in a string variable (in this example, called text), and then we use the indexOf method on text to see if the search string exists. If it does, we make a note of the date associated with the entry.

<source lang = "java">

   public String searchForText(String str) {
       String ret = "";
       Calendar tmpDate;
       String text;
       int year;
       int month;
       int day;
       for (JournalEntry tmp : allEntries) {
           text = tmp.getEntry();
           if (text.indexOf(str) != -1) {
               tmpDate = tmp.getDate();
               year = tmpDate.get(Calendar.YEAR);
               // MONTH is zero-based, so add one for display
               month = tmpDate.get(Calendar.MONTH) + 1;
               day = tmpDate.get(Calendar.DAY_OF_MONTH);
               ret = ret + "Term found in entry for "
                       + day + "/" + month + "/" + year
                       + ".\n";
           }
       }
       return ret;
   }

</source>

Now, we need to tie it all together into something we can effectively use by adding each menu option and implementing the code that goes along with the menu items in our actionPerformed method. We'll add menu options for loading, saving, and searching:

<source lang = "java">

   public void actionPerformed(ActionEvent e) {
       String str;
       String ret;
       Calendar c;
       if (e.getSource() == submit) {
           str = entryArea.getText();
           if (str.length() == 0) {
               JOptionPane.showMessageDialog(null,
                       "You haven't made an entry!");
               return;
           }
           c = new GregorianCalendar();
           myJournal.addEntry(str, c);
       }
       if (e.getSource() == search) {
           str = JOptionPane.showInputDialog(null,
                   "What do you want to search for?");
           // showInputDialog returns null if the user cancels the dialog
           if (str != null) {
               ret = myJournal.searchForText(str);
               JOptionPane.showMessageDialog(null, ret);
           }
       }
       if (e.getSource() == open) {
           myJournal.loadEntries();
       }
       if (e.getSource() == save) {
           myJournal.saveEntries();
       }
   }

</source>
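The handler above compares the event source against menu items called search, open and save, which still have to be created and attached to the menu bar. A minimal sketch of one way to do that follows; the class name JournalMenuSketch, the buildMenuBar method and the menu title "Journal" are assumptions made for the example, and in the real interface the three items would more naturally be attributes of JournalInterface itself.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JMenu;
import javax.swing.JMenuBar;
import javax.swing.JMenuItem;

public class JournalMenuSketch {

    // Names chosen to match the checks made in actionPerformed.
    public final JMenuItem open = new JMenuItem("Open");
    public final JMenuItem save = new JMenuItem("Save");
    public final JMenuItem search = new JMenuItem("Search");

    // Builds a menu bar holding one menu, and registers the listener on each item.
    public JMenuBar buildMenuBar(ActionListener listener) {
        JMenuBar bar = new JMenuBar();
        JMenu menu = new JMenu("Journal");
        for (JMenuItem item : new JMenuItem[] { open, save, search }) {
            item.addActionListener(listener);
            menu.add(item);
        }
        bar.add(menu);
        return bar;
    }

    public static void main(String[] args) {
        JournalMenuSketch sketch = new JournalMenuSketch();
        JMenuBar bar = sketch.buildMenuBar(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                System.out.println("Menu item chosen");
            }
        });
        // No window is opened here; we only confirm the structure.
        System.out.println(bar.getMenu(0).getItemCount());
    }
}
```

Inside the JournalInterface constructor the equivalent call would be along the lines of setJMenuBar(sketch.buildMenuBar(this)), since the frame already implements ActionListener.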

And there we go, we've implemented all the required functionality. There is still much tidying up to do - for one thing, a way to browse through previous entries would be useful. Implementing this extra functionality is left as an exercise for the reader.


A Slight Optimisation

As soon as we run this application and hit the submit button, a pretty serious performance problem becomes obvious: it takes a good six or seven seconds before the application becomes responsive again. The reason for this performance problem can be found in our code for padding the text of a journal entry:

<source lang = "java">

   public String paddedString(String str) {
       if (str.length() > 20000) {
           str = str.substring(0, 20000);
       } else {
           while (str.length() < 20000) {
               str = str + " ";
           }
       }
       return str;
   }

</source>

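As has been mentioned previously, string operations in Java are very costly. Strings are immutable, so every str + " " builds a brand new String and copies all of the existing characters into it, and padding a short entry means doing that nearly 20,000 times. We need to make this code more efficient, which we can do by making use of a StringBuffer object in place of a simple String: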

<source lang = "java">

   public String paddedString(String str) {
       if (str.length() > 20000) {
           str = str.substring(0, 20000);
       }
       // build the buffer after truncating, so over-long text stays cut to size
       StringBuffer tmp = new StringBuffer(str);
       while (tmp.length() < 20000) {
           tmp.append(" ");
       }
       return tmp.toString();
   }

</source>


Instantly, our application becomes much more responsive, just like we'd expect. Append operations with a StringBuffer take far less time than the equivalent append operations to a String. We should make use of the StringBuffer class whenever we are going to need to make a large number of modifications to any string of characters.
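As a variation on the same idea, the sketch below uses StringBuilder, which has been available since Java 5 and offers the same append operations as StringBuffer without the synchronisation overhead. It also reserves the full capacity up front so the internal array never has to grow. The class name PaddingSketch and the size parameter are assumptions for the example; the case study itself fixes the size at 20000.

```java
public class PaddingSketch {

    // Returns str cut down or padded with spaces to exactly 'size' characters.
    public static String padded(String str, int size) {
        if (str.length() >= size) {
            return str.substring(0, size);
        }
        // Reserve the final capacity once, then append in place.
        StringBuilder tmp = new StringBuilder(size);
        tmp.append(str);
        while (tmp.length() < size) {
            tmp.append(' ');
        }
        return tmp.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + padded("abc", 10) + "]");
        System.out.println(padded("Dear diary", 20000).length());
    }
}
```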

Running and using the application shows little in the way of other obvious performance problems, although these will start to appear as more and more entries are stored in the Journal. In the short term, there doesn't seem to be much in the way of obvious defects.

However, when we get to around 4000 entries, we'll start to see a huge lag when we load and save - even if we're only adding a single brand new entry, our saveEntries method will go through every element in the journal and save absolutely everything, even if it hasn't been changed.

Rather than save everything when the save button is pressed, we should instead save only those journal entries that have been modified in some way. We can do this fairly easily by adding a boolean flag to each that indicates if the contents of an entry have changed. Then, before we actually perform our write operation on an entry, we check against this boolean variable to determine if it's necessary. Since we're using random access files, we only need to save those entries that have had their state changed. To implement this, we need to change our JournalEntry class a little:


<source lang = "java">

import java.util.*;

public class JournalEntry {

   private Calendar timeOfEntry;
   private String journalEntry;
   private boolean changed;
   public JournalEntry(Calendar c, String j) {
       timeOfEntry = c;
       journalEntry = paddedString(j);
       setChanged(true);
   }
   public String getEntry() {
       return journalEntry;
   }
   public void setEntry(String str) {
       journalEntry = paddedString(str);
       setChanged(true);
   }
   public Calendar getDate() {
       return timeOfEntry;
   }
   public String paddedString(String str) {
       if (str.length() > 20000) {
           str = str.substring(0, 20000);
       }
       // build the buffer after truncating, so over-long text stays cut to size
       StringBuffer tmp = new StringBuffer(str);
       while (tmp.length() < 20000) {
           tmp.append(" ");
       }
       return tmp.toString();
   }
   public boolean hasChanged() {
       return changed;
   }
   public void setChanged(boolean v) {
       changed = v;
   }

} </source>

And then we also need to change the relevant parts of the saveEntries method in Journal:

<source lang = "java">

       for (int i = 0; i < allEntries.size(); i++) {
           tmp = allEntries.get(i);
           if (!tmp.hasChanged()) {
               // nothing new to write for this entry
               continue;
           }
           ...
           
           myFile.writeUTF(text);
           tmp.setChanged(false);
       }

</source>

Now when we call this method, it will only save those entries in our data file that have been changed. Once we've saved their new state, we make use of the setChanged method to indicate there is no longer a requirement for this entry to be saved in future operations. This is a great improvement in efficiency since we only perform costly seek and write operations when we need to, rather than for every record in the data structure.
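To see the changed flag at work outside the full application, here is a stripped-down sketch. It is not the Journal class: the names ChangedOnlySaveSketch and Entry are invented, the date fields are left out to keep it short, the text field is 16 characters rather than 20000, and it assumes Java 7 or later for try-with-resources. The saveEntries method returns the number of records it actually wrote so the effect of the flag is visible.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ChangedOnlySaveSketch {

    static final int TEXT_LENGTH = 16;             // 20000 in the real journal
    static final int RECORD_SIZE = 2 + TEXT_LENGTH; // length prefix plus text

    // Minimal stand-in for JournalEntry: just text and a changed flag.
    public static class Entry {
        public String text;
        public boolean changed = true; // new entries always need saving
        public Entry(String text) { this.text = text; }
    }

    public final List<Entry> allEntries = new ArrayList<Entry>();

    static String pad(String s) {
        StringBuilder b = new StringBuilder(s);
        if (b.length() > TEXT_LENGTH) {
            b.setLength(TEXT_LENGTH);
        }
        while (b.length() < TEXT_LENGTH) {
            b.append(' ');
        }
        return b.toString();
    }

    // Writes only the entries whose flag is set; returns how many were written.
    public int saveEntries(File f) {
        int written = 0;
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            for (int i = 0; i < allEntries.size(); i++) {
                Entry e = allEntries.get(i);
                if (!e.changed) {
                    continue; // skip the seek and the write entirely
                }
                raf.seek((long) i * RECORD_SIZE); // entry i owns slot i
                raf.writeUTF(pad(e.text));
                e.changed = false;
                written++;
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return written;
    }

    public static File newTempFile() {
        try {
            File f = File.createTempFile("journal", ".dat");
            f.deleteOnExit();
            return f;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        ChangedOnlySaveSketch journal = new ChangedOnlySaveSketch();
        journal.allEntries.add(new Entry("Monday"));
        journal.allEntries.add(new Entry("Tuesday"));
        File f = newTempFile();
        System.out.println(journal.saveEntries(f)); // both entries are new
        System.out.println(journal.saveEntries(f)); // nothing has changed since
    }
}
```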

These are the two main drags on performance - opening a data file will only be done once per run of the application (usually), so changing this to be more efficient will yield only marginal benefits. We may also gain some small benefits from changing the search method to make use of a StringBuffer for building the return string. Even though searching is likely to be a relatively unusual activity, if the application hangs for many seconds when the user clicks on the option, we need to make some effort to solve that problem:

<source lang = "java">

   public String searchForText(String str) {
       StringBuffer ret = new StringBuffer("");
       Calendar tmpDate;
       String text;
       int year;
       int month;
       int day;
       for (JournalEntry tmp : allEntries) {
           text = tmp.getEntry();
           if (text.indexOf(str) != -1) {
               tmpDate = tmp.getDate();
               year = tmpDate.get(Calendar.YEAR);
               // MONTH is zero-based, so add one for display
               month = tmpDate.get(Calendar.MONTH) + 1;
               day = tmpDate.get(Calendar.DAY_OF_MONTH);
               ret.append("Term found in entry for "
                       + day + "/" + month + "/" + year
                       + ".\n");
           }
       }
       if (ret.length() == 0) {
           ret.append("No matches.");
       }
       // the method is declared to return a String, so convert the buffer
       return ret.toString();
   }

</source>

Conclusion

This case study has allowed us to pull together our knowledge of random access files and data structures into an application that requires a fairly intricate combination of the two. It also marks the end of our Java-based case studies, since by now we have explored a very large number of the issues that go along with developing object oriented software.