Perform Read-Write Operations in Excel using SQL commands from code

Coding is actually a lot of fun, I agree! But there are times when you would want to just get an important task done, without writing a lot of lines of code for it. I came to know about this Java library “Fillo” last month and wanted to check out what it offers – and I am not disappointed! The Apache POI API provides some level of flexibility to play around with the excel worksheets, the rows and cells by providing a lot of interfaces and built-in methods. But my personal opinion is that, to perform read-write operations for data-driven automated tasks, the Java code sometimes become too verbose, complex and difficult to understand. I have observed that, in some test automation frameworks, the code to use POI has been written in such a complex way that if any new requirement comes regarding the driving of the data, it would call for a whole restructuring of the class/methods which implemented the excel data-driving logic.

By the way, POI stands for “Poor Obfuscation Implementation” – I didn’t know this. 🙂

For those of you who are familiar with the Relational Databases, probably an easier way to think of the Excel worksheets is to imagine them as “Tables”, with the columns as “Fields” and the rows as “Records”. And now, what if I tell you that you can programmatically read data from and write data into these excel worksheets using SQL statements from your Java code? – You will definitely like it.

Fillo – What is it?
Fillo is an open source Java API (created by Codoid) which can perform operations on Excel files with .xls, .xlsx and .xlsm file extensions. It is a wrapper on top of the Apache POI. What makes this API unique and pretty cool is that you can use some of the SQL (DDL, DQL and DML) commands to play around with the data present inside the excel worksheets. This makes data retrieval and data manipulation much easy when dealing with data-driven automated scripts. The latest version and the version I have used is v1.21. Of course, the API has its limitations which I will mention towards the end of the blog.

Fillo – Setup
Setting up and integrating Fillo to your Java project is quite straight forward. If you are using Gradle as the build management tool, you can add the below dependency inside the build.gradle file.

dependencies {
   compile group: 'com.codoid.products', name: 'fillo', version: '1.21'
}

If you are using Maven, you have to add the below in your pom.xml file

<dependency>
    <groupId>com.codoid.products</groupId>
    <artifactId>fillo</artifactId>
    <version>1.21</version>
</dependency> 

Fillo – Code
To make use of the Fillo library, we have to first create an object of Fillo and call its “getConnection” method by passing the path where the Excel file exists.

Fillo fillo = new Fillo();
Connection connection = fillo.getConnection(“./files/SampleExcelFile1.xlsx”);

This will return the connection object whose “executeQuery” method we can call by passing the SQL command as parameter.
It will return a recordset which can be thought of as a data structure that contains a group of the records found. At a given time, the recordset object refers to a single record from the table.

Recordset recordset = connection.executeQuery("SELECT * FROM InfoSheet1");

We can then use the next() method of the recordset to iterate over each record.

while (recordset.next()) {
      System.out.println(recordset.getField("<Enter the field/excel-column name here>")
}

Lets’ s now see what all SQL commands does Fillo support. We will consider the below data in the excel file for the operations.

StudentIDFirst NameLast NameProgrammingLanguageMarks Obtained (out of 100)
1SamSmithC72
2CharlesDoeJava84
3RobertChopraC++58
4RichardVaughanScala93
5CurtPhilipsScala83
6JaneCronjePython66

Fillo – SELECT
Simplest of all, you can use the SQL statement with the SELECT command to read data from the Excel file based on the column name. Let us consider the table name to be “InfoSheet1”. Using the below code we can retrieve all the data present inside the columns StudentID, First Name, Last Name, ProgrammingLanguage and Marks Obtained (out of 100).

@Test
public void selectTest(){
try {
    Fillo fillo = new Fillo();
    Connection connection = fillo.getConnection(“./files/SampleExcelFile1.xlsx”);
    Recordset recordset = connection.executeQuery("SELECT * FROM InfoSheet1");
    while (recordset.next()) {
	System.out.println(recordset.getField("StudentID") + " " + recordset.getField("First Name") + " "+ recordset.getField("Last Name") + " " + recordset.getField("ProgrammingLanguage") + " "
+ recordset.getField("Marks Obtained (out of 100)"));
  }
   recordset.close();
   connection.close();
  } catch (FilloException e) {
   e.printStackTrace();
  } catch (Exception e) {
   e.printStackTrace();
  }
}

You can also use the WHERE clause with the AND operator to further modify your search.

Query:
SELECT * FROM InfoSheet1 WHERE StudentID=2;
SELECT * FROM InfoSheet1 WHERE (\”Last Name\”=’Philips’ AND \”Marks Obtained (out of 100)\”=83)

Notice the escape sequence (\”\”) used in case where the column name consists of whitespaces and special characters. Same applies for the table name too e.g. if the table name is “Info Sheet2”, we have to use:
SELECT * FROM \”Info Sheet2\”

Some other functionalities:
To get data from a particular field:
recordset.getField(“<FieldName>”);

To move to the next record:
recordset.next();

To get the count of total rows:
recordset.getCount();

To move to the first row:
recordset.moveFirst();

To move to the next row:
recordset.moveNext();

To move to the last row:
recordset.moveLast();

To move to the previous row:
recordset.movePrevious();

To get all the fields (as an ArrayList):
recordset.getFieldNames();

To get the number of columns (using the size method of ArrayList):
recordset.getFieldNames().size();

To get the data from a particular column (field) using index (here, I have used the index 1):
recordset.getField(1).value();

To get all the worksheet names present in the excel workbook (as an ArrayList):
List<String> sheetNames = connection.getMetaData().getTableNames();

Using the where() method provided by Fillo (same as using the WHERE clause in the SQL statements):
Recordset recordset = connection.executeQuery(“SELECT * FROM InfoSheet1”).where(“\”Last Name\”=’Philips'”).where(“\”Marks Obtained (out of 100)\”=83”);

Support to check for records with null/non-null values:
SELECT * FROM InfoSheet1 WHERE StudentID = null
SELECT * FROM InfoSheet1 WHERE StudentID != null

Support to the LIKE operator:
SELECT * FROM InfoSheet1 WHERE ProgrammingLanguage LIKE ‘P%’

Support to set the starting row and starting column before performing operations:
System.setProperty(“ROW”,”2”);
System.setProperty(“COLUMN”,”4”);

Fillo – INSERT
We can insert data into the spreadsheet using the “INSERT” SQL command that we are all familiar with. Remember to use the “executeUpdate()” method instead of the “executeQuery()” method here. The below code will insert a row in the worksheet named “Info – Sheet3” with the data 21, Ron, Samuels, Java, 88 against the columns StudentID, First Name, Last Name, ProgrammingLanguage and Marks Obtained (out of 100) respectively.

@Test
public void insertTest() {
   String query = "INSERT INTO \"Info - Sheet3\"(StudentID,\"First Name\",\"Last Name\",ProgrammingLanguage,\"Marks Obtained (out of 100)\") VALUES(21,'Ron','Samuels','Java',88)";
   String filePath = "./files/SampleExcelFile1.xlsx";
   try {
     Fillo fillo = new Fillo();
     Connection connection = fillo.getConnection(filePath);
     connection.executeUpdate(query);
     connection.close();
   } catch (FilloException e) {
     e.printStackTrace();
   } catch (Exception e) {
     e.printStackTrace();
   }
}

Fillo – UPDATE
If we want to update a value present inside the spreadsheet, we can do that using the UPDATE command. Here also, we need to use the “executeUpdate()” method instead of the “executeQuery()” method.

@Test
public void updateTest() {
   String query = "UPDATE \"Info Sheet2\" SET ProgrammingLanguage='Scala' WHERE (StudentID=5 and \"First Name\"='Curt' AND \"Last Name\"='Philips')";
   String filePath = "./files/SampleExcelFile1.xlsx";
   try{
	Fillo fillo = new Fillo();
	Connection connection = fillo.getConnection(filePath);
	connection.executeUpdate(query);
	connection.close();
   } catch (FilloException e) {
	e.printStackTrace();
   } catch (Exception e) {
	e.printStackTrace();
   }
}

Fillo – Create and Delete Worksheet
Fillo provides the functionality to programmatically create and delete the worksheets too without requiring to write a large chunk of code. Below are the code snippets to create a spreadsheet with the columns and to delete the whole worksheet.

@Test
public void createSheet() {
    String filePath = "./files/SampleExcelFile1.xlsx";
    try {
        Fillo fillo = new Fillo();
        Connection connection = fillo.getConnection(filePath);
        connection.createTable("InfoSheet4", new String[] { "StudentID", "First Name", "Last Name",
					"ProgrammingLanguage", "Marks Obtained (out of 100)" });
        connection.close();
   } catch (FilloException e) {
     e.printStackTrace();
   } catch (Exception e) {
     e.printStackTrace();
   }
}
@Test
public void deleteSheet() {
    String filePath = "./files/SampleExcelFile1.xlsx";
    try {
        Fillo fillo = new Fillo();
	Connection connection = fillo.getConnection(filePath);
	connection.deleteTable("InfoSheet4");
	connection.close();
    } catch (FilloException e) {
	e.printStackTrace();
    } catch (Exception e) {
	e.printStackTrace();
    }
}

Fillo – things that won’t work
Though Fillo is flexible enough to perform much of the tasks needed to drive data from and into the code, it has some limitations too. Below are some of the things that won’t work with the present Fillo version (v1.21). These features may or may not be supported in the later versions.

Cell Coloring:
You won’t be able to color the cells in Excel using Fillo.

OR operator with the WHERE clause:
Unlike the AND operator, you will not be able to use the OR operator with the WHERE clause while sending the SQL commands to Fillo.

MIN, MAX, ORDER BY, DISTINCT, TOP
The SQL aggregate functions (like MIN, MAX, SUM, COUNT, AVG) and other SQL commands like ORDER BY, DISTINCT, TOP are still not supported.

SQL Subqueries
The SQL subqueries and correlated subqueries are not supported. You need to write separate queries to perform such functions.

Special Characters
If the data contains special characters, then it will throw exception. I was not able to get a solution to this problem and my basic analysis indicates that this is still not supported.

Conclusion:
Overall, analyzing the different functionalities that Fillo is providing right now, I think it will be effective in most of the cases where complex manipulation of the spreadsheets (and data inside it) is not involved. If simplifying a code to do certain actions with the help of an open-source API saves time, reduces complexity and becomes manageable, then it is worth giving it a try.

Credits:
https://codoid.com/fillo/