Handling huge data using ZK

From Documentation
DocumentationSmall Talks2009JulyHandling huge data using ZK
Handling huge data using ZK

Author
Timothy Clare, Technology Evangelist, Potix Corporation
Date
July 06, 2009
Version
ZK 3.6.2


Overview

When your database contains a large amount of data you do not want to load all these items into memory. ZK provides many methods for handling large amounts of data including:


  • Live data scrolling
  • Load on demand Trees
  • Paging components


One of the most convenient ways in which to handle a large amount of data is using a paging mechanism.

Introducing huge data with paging

However, normally the paging mechanism will load all the data into the grid via the model. This Small Talk guides you through a method of handling large amounts (in this case 1,000,000 database entries) of data using a small memory footprint.


The basic premise of the Small Talk is to store a list of users within a database, including their first name, surname and phone number.

Demo

Creating the database

We create a database named livedata and create a table named tblusers. We will provide the following columns, id (INT, AI, PK), first_name (VARCHAR), last_name(VARCHAR) and phone(VARCHAR).

The SQL statement to create the table is provided below


CREATE TABLE tblusers (id int(10) NOT NULL AUTO_INCREMENT, 
                                      first_name varchar(200) NOT NULL, 
                                      last_name varchar(200) NOT NULL, 
                                      phone varchar(20) NOT NULL, PRIMARY KEY (id) ) 
                                      ENGINE=InnoDB DEFAULT CHARSET=utf8;

The database screen

We have introduced a database screen which allows you to quickly setup and test your database connection. This is the first page that you arrive at when running the sample, if you have any problems you can manually edit the DatabaseInteraction class and recompile it.

Interacting with the database

Linking the application with the database is the easy part and involves creating a DAO to handle the interaction. The DAO, in this instance, only consists of functions to insert and retrieve news items from the database. Please refer to the UserDAO for more details.

The connection is handled via the MySQL J/Connector with a new connection being made for each operation. Please note that a new connection for each operation is NOT the best method of implementation, a connection pool would be far more efficient so please do not use the code for production level software. Building the model

The model needs to represent our user utilizing Java’s Object Orientated features, hence we implement a Java class named User which includes getters and setters for every field present in the database.


The code within the User class is included below:

public class User {

	private int _id;
	private String _firstName;
	private String _lastName;
	private String _phone;
	
	public String getFirstName() {
		return _firstName;
	}
	public void setFirstName(String firstName) {
		this._firstName = firstName;
	}
	public String getLastName() {
		return _lastName;
	}
	public void setLastName(String lastName) {
		this._lastName = lastName;
	}
	public String getPhone() {
		return _phone;
	}
	public void setPhone(String phone) {
		this._phone = phone;
	}
	public void setId(int id) {
		this._id = id;
	}
	public int getId() {
		return _id;
	}
}

With the User class implemented along with the functions to retrieve them from the database we need to construct the model.


The PagingListModel and the AbstractPagingListModel

To make the implementation of a paging model easier a base class which we can easily extend was created by me.


AbstractPagingListModel

The AbstractPagingListModel was created by myself (it is included in the source zip) to take most of the hard work away when implementing a paging system which limits the memory use. The AbstractPagingListModel extends the AbstractListModel provided by ZK and hence has to override two functions:


public Object getElementAt(int index)

public int getSize()


These two functions are used to manipulate an internal list which contains the retrieved items from the database. This is not done all at once but on a page per page basis. For example if a database table had 1,000,000 entries and the pageSize was 20, there would only be 20 items in memory at any given time.


The function getElementAt returns an element from a model at a particular index. The getElementAt function accesses a local List which contains the data retrieved from the data source (in the case of this Small Talk it is a database).


It is tempting to use this function to directly access the database and retrieve items one by one, while this is tempting it not the best practice as it would require multiple queries to database and thus be slower than one query.


The getSize function returns the size of the local model, in the case of the example this is 20. It is important that you do not return the total number of items in the database as this would lead to a large amount of memory being used.


Functions we need to override

The AbstractPagingModel requires us to override two functions, these are:


public abstract int getTotalSize();

protected abstract List<T> getPageData(int itemStartNumber, int pageSize);


These two functions provide the model with the data for a given page and retrieve the total amount of items in the database table. This information can come from any data source, the decision is up to us.

The constructor

The constructor takes two integer arguments, startPageNumber and pageSize which are multiplied together to retrieve the item number to start from. This item number is then used in a database query to retrieve the list needed. The database query works by using SQL LIMIT to define the starting point (startPageNumber * pageSize) and the number of Users to retrieve (pageSize).

The constructor code is included below:

public AbstractPagingListModel(int startPageNumber, int pageSize) {
		super();
		this._startPageNumber = startPageNumber;
		this._pageSize = pageSize;
		
		int itemStartNumber = startPageNumber * pageSize;
		_items = getPageData(_itemStartNumber, _pageSize);
	}


Having retrieved the appropriate number of Users we need to store these in memory, hence the amount of Users in memory will only be your pageSize. This functionality is provided by an internal List.

PagingListModel

The PagingListModel overrides the required functions and provides a constructor as follows:

public class PagingListModel extends AbstractPagingListModel<User> {

	private static final long serialVersionUID = 4370353438941246687L;

	
	public PagingListModel(int startPageNumber, int pageSize) {
		super(startPageNumber, pageSize);
	}

	@Override
	protected List<User> getPageData(int itemStartNumber, int pageSize) {
		return DatabaseInformation.dao.selectWithLimit(itemStartNumber, pageSize);
	}

	@Override
	public int getTotalSize() {
		return DatabaseInformation.dao.getUserCount();
	}

}

Building the UI

The basic GUI is very simple and consists of two parts

  1. The button which generates 1,000,000 database entries
  2. The paging area (including Grid and paging components)


Separation of the Paging component and the Grid

One of the most important features of this small talk is the fact that the Paging and Grid component are separated. The interaction between the two is implemented by ourselves and not automatically done by ZK. It is possible to autowire this functionality however, this would require calling the setPaginal function which we have already established is not a good idea in this case.

The markup below is the ZUL code from our index.zul

<window title="Huge data demo"  border="normal" apply="org.zkoss.zklargelivelist.controllers.MainUserController">
	
	<label value="If you need to generate 1,000,000 entries please press this button!" />
	<button id="btnGenerate" label="Generate database entries" />
	
	<separator />
	
	<paging id="userPaging"/>
	<grid id="dataGrid">
		<columns>
			<column label="id" width="60px"/>
			<column label="firstName"/>
			<column label="lastName"/>
			<column label="phone"/>
		</columns>
	</grid>
</window>

Please note that the Window component uses the MainUserController to respond to any events fired by the GUI. The controller is discussed in the next section.

Controller functionality

The controller handles two GUI events and overrides the doAfterCompose method. Firstly let us deal with the doAfterCompose method and explain what it does. The method is called after the components have been initialized. This allows us to manipulate the components before the user interacts with them.


doAfterCompose

public void doAfterCompose(Component comp) throws Exception {
		super.doAfterCompose(comp);

		dataGrid.setRowRenderer(new RowRenderer(){
			@SuppressWarnings("unchecked")
			public void render(Row row, Object data) throws Exception {
				User usr = (User) data;
				row.getChildren().add(new Label(usr.getId()+""));
				row.getChildren().add(new Label(usr.getFirstName()));
				row.getChildren().add(new Label(usr.getLastName()));
				row.getChildren().add(new Label(usr.getPhone()));
			}});
		
		refreshModel(_startPageNumber);
	}


We create our own custom row renderer to format the data and then refresh the model. The refresh model function will create a new PagingListModel and attach that model to the grid, hence displaying the first page of users. Notice the speed and low memory usage of this as we only pull the amount of users that we require for one page and hold that in memory!

refreshModel

private void refreshModel(int activePage){
		userPaging.setPageSize(_pageSize);
		model = new PagingListModel(activePage, _pageSize);
		
		if(_needsTotalSizeUpdate) {
			_totalSize = model.getTotalSize();
			_needsTotalSizeUpdate = false;
		}
		
		userPaging.setTotalSize(_totalSize);
		
		dataGrid.setModel(model);	
	}

When refreshing the model the first thing we do is set the pageSize of the paging system, this is always set to 20, however in your own custom implementations it can be changed to any value you like. After setting the page size and creating the new PagingListModel we set the total size of the data in the paging system, this is a count of the number of items in the database. Finally the model is attached to the dataGrid using the dataGrid's setModel method.

GUI events

	public void onClick$btnGenerate() {	
		int numberOfUsers = DatabaseInformation.dao.getUserCount();
		final int numberOfEntriesRequired = 1000000;
		
		int usersRequired = numberOfEntriesRequired - numberOfUsers;
		
		if(usersRequired > 0) {
			User[] users = new User[usersRequired];
						
			for(int i=0; i<usersRequired; i++) {
				users[i] = GenerateData.generateUser();
			}
				
			DatabaseInformation.dao.insertUser(users);
			
			_needsTotalSizeUpdate = true;
			
			refreshModel(userPaging.getActivePage());
		}
		else
		{
			try {
				Messagebox.show("There are already 1,000,000 entries in the database!");
			} catch (InterruptedException e) {
				e.printStackTrace();
			}
		}
	}


The other two functions are related to handling GUI events, the first is onClick$btnGenerate which is used to generate 1,000,000 entries in our database. This functionality is self explanatory and not the focus of the Small Talk so we will move onto the second function onPaging$userPaging.

public void onPaging$userPaging(ForwardEvent event){
		final PagingEvent pe = (PagingEvent) event.getOrigin();
		_startPageNumber = pe.getActivePage();
		refreshModel(_startPageNumber);
	}


The function is small but is essential in regulating the model, the function retrieves the active page from the Paging event and then refreshes the model. Upon refreshing the model as described above the model is recreated and then set to the data grid, thus moving the paging along.

Important information

setPaginal

It is important that you do not set the paging element of your listbox or grid. It is not advisable to have grid controlling a paginal component where the total number of items doesn’t match the number of grid’s list items.


pageSize and totalSize

To retrieve the pageSize you need to use the method getPageSize() of the paging component rather than the grid or listbox. In addition we have to make sure we set the total size of the paging component to that of the total number of items in your database table, or what is returned from the query.


Download

Please download the source here.

You can also download the WAR file here.




Copyright © Potix Corporation. This article is licensed under GNU Free Documentation License.