Friday, September 27, 2019

How to Read Excel Files in Java using Apache POI

1. Getting Apache POI library

Apache POI is a pure-Java API for reading and writing Excel files in both the XLS format (Excel 2003 and earlier) and the XLSX format (Excel 2007 and later). To use Apache POI in your Java project:
  • For non-Maven projects:
    • Download the latest release of the library here: Apache POI - Download Release Artifacts
      Extract the zip file and add the appropriate JAR files to your project’s classpath:
      - If you only read and write the Excel 2003 format, the file poi-VERSION.jar is enough.
      - If you read and write the Excel 2007 format, you have to include the following files in addition to poi-VERSION.jar:
      • poi-ooxml-VERSION.jar
      • poi-ooxml-schemas-VERSION.jar
      • xmlbeans-VERSION.jar


  • For Maven projects: Add the following dependency to your project’s pom.xml file:
    • For Excel 2003 format only:
      <dependency>
          <groupId>org.apache.poi</groupId>
          <artifactId>poi</artifactId>
          <version>VERSION</version>
      </dependency>
    • For Excel 2007 format:
      <dependency>
          <groupId>org.apache.poi</groupId>
          <artifactId>poi-ooxml</artifactId>
          <version>VERSION</version>
      </dependency>
       The latest stable version of Apache POI is 3.11 (at the time of writing this tutorial).
 

2. The Apache POI API Basics

There are two main prefixes which you will encounter when working with Apache POI:
  • HSSF: denotes the API is for working with Excel 2003 and earlier.
  • XSSF: denotes the API is for working with Excel 2007 and later.
To get started with the Apache POI API, you only need to understand and use the following 4 interfaces:
  • Workbook: high-level representation of an Excel workbook. Concrete implementations are HSSFWorkbook and XSSFWorkbook.
  • Sheet: high-level representation of an Excel worksheet. Typical implementing classes are HSSFSheet and XSSFSheet.
  • Row: high-level representation of a row in a spreadsheet. HSSFRow and XSSFRow are the two concrete classes.
  • Cell: high-level representation of a cell in a row. HSSFCell and XSSFCell are the typical implementing classes.
Now, let’s walk through some real-life examples.
 

3. Reading from Excel File Examples

Suppose you want to read an Excel file whose content looks like the following screenshot:
[Screenshot: Books Excel File]
This spreadsheet contains information about books (title, author and price).
 

A Simple Example to Read Excel File in Java

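Here’s a quick-and-dirty example that reads the first sheet of the workbook and prints the value of every cell, row by row. It is written against POI 3.x; newer POI releases replace the int constants such as Cell.CELL_TYPE_STRING with the CellType enum: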
package net.codejava.excel;
 
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
 
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
 
/**
 * A dirty simple program that reads an Excel file.
 * @author www.codejava.net
 *
 */
public class SimpleExcelReaderExample {
     
    public static void main(String[] args) throws IOException {
        String excelFilePath = "Books.xlsx";
        FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
         
        Workbook workbook = new XSSFWorkbook(inputStream);
        Sheet firstSheet = workbook.getSheetAt(0);
        Iterator<Row> iterator = firstSheet.iterator();
         
        while (iterator.hasNext()) {
            Row nextRow = iterator.next();
            Iterator<Cell> cellIterator = nextRow.cellIterator();
             
            while (cellIterator.hasNext()) {
                Cell cell = cellIterator.next();
                 
                switch (cell.getCellType()) {
                    case Cell.CELL_TYPE_STRING:
                        System.out.print(cell.getStringCellValue());
                        break;
                    case Cell.CELL_TYPE_BOOLEAN:
                        System.out.print(cell.getBooleanCellValue());
                        break;
                    case Cell.CELL_TYPE_NUMERIC:
                        System.out.print(cell.getNumericCellValue());
                        break;
                }
                System.out.print(" - ");
            }
            System.out.println();
        }
         
        workbook.close();
        inputStream.close();
    }
 
}
Output:
Head First Java - Kathy Serria - 79.0 -
Effective Java - Joshua Bloch - 36.0 -
Clean Code - Robert Martin - 42.0 -
Thinking in Java - Bruce Eckel - 35.0 -
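
Numeric cells are returned as double, which is why the prices above print as 79.0 rather than 79. POI's DataFormatter class can render a cell the way Excel displays it; as a library-free alternative, the sketch below trims the trailing zero with java.text.DecimalFormat. The PriceFormat class and its format method are made-up names used only for illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, not part of Apache POI.
class PriceFormat {

    // "0.##" prints at least one integer digit and at most two decimals.
    // Locale.ROOT keeps the decimal separator a dot on every machine.
    static String format(double value) {
        DecimalFormat df = new DecimalFormat("0.##",
                DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(79.0)); // prints 79
        System.out.println(format(36.5)); // prints 36.5
    }
}
```

In the reader above, the value passed in would be the result of cell.getNumericCellValue().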
 

A More Object-Oriented Example to read Excel File

For a cleaner, more object-oriented program, let’s create a model class (Book.java) with the following code:
package net.codejava.excel;
 
public class Book {
    private String title;
    private String author;
    private double price; // double, to match Cell.getNumericCellValue()
 
    public Book() {
    }
 
    public String toString() {
        return String.format("%s - %s - %f", title, author, price);
    }
 
    // getters and setters
}
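
The getters and setters are elided in the listing above. A minimal completion might look like the following sketch. The accessor bodies are assumptions, the price is held as a double because the reading code further down casts the numeric cell value to double, and Locale.ROOT is added so the decimal separator does not depend on the machine's locale.

```java
import java.util.Locale;

// Sketch of the Book model with its accessors written out (bodies assumed).
class Book {
    private String title;
    private String author;
    private double price;

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }

    public double getPrice() { return price; }
    public void setPrice(double price) { this.price = price; }

    @Override
    public String toString() {
        // %f prints six decimal places, e.g. 36.000000
        return String.format(Locale.ROOT, "%s - %s - %f", title, author, price);
    }
}

class BookDemo {
    public static void main(String[] args) {
        Book book = new Book();
        book.setTitle("Effective Java");
        book.setAuthor("Joshua Bloch");
        book.setPrice(36.0);
        System.out.println(book); // prints Effective Java - Joshua Bloch - 36.000000
    }
}
```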
Next, write a method that reads the value of a cell:
private Object getCellValue(Cell cell) {
    switch (cell.getCellType()) {
    case Cell.CELL_TYPE_STRING:
        return cell.getStringCellValue();
 
    case Cell.CELL_TYPE_BOOLEAN:
        return cell.getBooleanCellValue();
 
    case Cell.CELL_TYPE_NUMERIC:
        return cell.getNumericCellValue();
    }
 
    return null;
}
Next, implement a method that reads an Excel file and returns a list of books:
public List<Book> readBooksFromExcelFile(String excelFilePath) throws IOException {
    List<Book> listBooks = new ArrayList<>();
    FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
 
    Workbook workbook = new XSSFWorkbook(inputStream);
    Sheet firstSheet = workbook.getSheetAt(0);
    Iterator<Row> iterator = firstSheet.iterator();
 
    while (iterator.hasNext()) {
        Row nextRow = iterator.next();
        Iterator<Cell> cellIterator = nextRow.cellIterator();
        Book aBook = new Book();
 
        while (cellIterator.hasNext()) {
            Cell nextCell = cellIterator.next();
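            // getColumnIndex() is zero-based, so 1, 2 and 3 below are columns B, C and D;
            // adjust the case labels if your data starts in column A
            int columnIndex = nextCell.getColumnIndex();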
 
            switch (columnIndex) {
            case 1:
                aBook.setTitle((String) getCellValue(nextCell));
                break;
            case 2:
                aBook.setAuthor((String) getCellValue(nextCell));
                break;
            case 3:
                aBook.setPrice((double) getCellValue(nextCell));
                break;
            }
 
 
        }
        listBooks.add(aBook);
    }
 
    workbook.close();
    inputStream.close();
 
    return listBooks;
}
And here is the testing code:
public static void main(String[] args) throws IOException {
    String excelFilePath = "Books.xlsx";
    ExcelReaderExample2 reader = new ExcelReaderExample2();
    List<Book> listBooks = reader.readBooksFromExcelFile(excelFilePath);
    System.out.println(listBooks);
}
Output:
[Head First Java - Kathy Serria - 79.000000, Effective Java - Joshua Bloch - 36.000000,
    Clean Code - Robert Martin - 42.000000, Thinking in Java - Bruce Eckel - 35.000000]
 

How to Read both Excel 2003 and 2007 format in Java

To support both Excel 2003 and Excel 2007 files, it’s recommended to write a separate factory method that returns an XSSFWorkbook or an HSSFWorkbook depending on the file extension (.xls or .xlsx):
private Workbook getWorkbook(FileInputStream inputStream, String excelFilePath)
        throws IOException {
    Workbook workbook = null;
 
    if (excelFilePath.endsWith("xlsx")) {
        workbook = new XSSFWorkbook(inputStream);
    } else if (excelFilePath.endsWith("xls")) {
        workbook = new HSSFWorkbook(inputStream);
    } else {
        throw new IllegalArgumentException("The specified file is not Excel file");
    }
 
    return workbook;
}
And here’s a usage example of this factory method:
String excelFilePath = "Books.xlsx"; // can be .xls or .xlsx
 
FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
 
Workbook workbook = getWorkbook(inputStream, excelFilePath);
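
The extension check in getWorkbook is case-sensitive and matches a bare suffix, so a file named Books.XLSX would be rejected. A slightly stricter check could look like the sketch below, which lower-cases the path and requires the leading dot. ExcelFormat and fromPath are hypothetical names and not part of Apache POI; the factory method could switch on the returned value to pick XSSFWorkbook or HSSFWorkbook.

```java
import java.util.Locale;

// Hypothetical helper, not part of Apache POI.
enum ExcelFormat {
    XLS, XLSX;

    // Decides the format from the file extension, ignoring case.
    static ExcelFormat fromPath(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".xlsx")) {
            return XLSX;
        }
        if (lower.endsWith(".xls")) {
            return XLS;
        }
        throw new IllegalArgumentException("Not an Excel file: " + path);
    }
}

class ExcelFormatDemo {
    public static void main(String[] args) {
        System.out.println(ExcelFormat.fromPath("Books.xlsx")); // prints XLSX
        System.out.println(ExcelFormat.fromPath("LEGACY.XLS")); // prints XLS
    }
}
```

Checking the extension only tells you what the file claims to be; a renamed file will still fail when the workbook constructor tries to parse it.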
 

Reading Other Information

  • Get a specific sheet:
    Sheet thirdSheet = workbook.getSheetAt(2);
  • Get sheet name:
    String sheetName = sheet.getSheetName();
  • Get total number of sheets in the workbook:
    int numberOfSheets = workbook.getNumberOfSheets();
  • Get all sheet names in the workbook:
    int numberOfSheets = workbook.getNumberOfSheets();
     
    for (int i = 0; i < numberOfSheets; i++) {
        Sheet aSheet = workbook.getSheetAt(i);
        System.out.println(aSheet.getSheetName());
    }
  • Get comment of a specific cell:
    Comment cellComment = sheet.getCellComment(2, 2);
    System.out.println("comment: " + cellComment.getString());
For reading other information, see the getXXX() methods of the Workbook, Sheet, Row and Cell interfaces.
