How to read file using NPOI

C#.NetExcelNpoi

C# Problem Overview


I found NPOI is very good to write Excel files with C#.

But I want to open, read and modify Excel files in C#.

How can I do this?

C# Solutions


Solution 1 - C#

Simple read example below:

using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;

//.....

private void button1_Click(object sender, EventArgs e)
{
    HSSFWorkbook hssfwb;
    using (FileStream file = new FileStream(@"c:\test.xls", FileMode.Open, FileAccess.Read))
    {
        hssfwb= new HSSFWorkbook(file);
    }

    ISheet sheet = hssfwb.GetSheet("Arkusz1");
    for (int row = 0; row <= sheet.LastRowNum; row++)
    {
        if (sheet.GetRow(row) != null) //null is when the row only contains empty cells 
        {
            MessageBox.Show(string.Format("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue));
        }
    }
}  

By the way: on NPOI website here in Download section there is example package - a pack of C# examples. Try it, if you haven't yet. :)

Solution 2 - C#

It might be helpful to rely on the Workbook factory to instantiate the workbook object since the factory method will do the detection of xls or xlsx for you. Reference: http://apache-poi.1045710.n5.nabble.com/How-to-check-for-valid-excel-files-using-POI-without-checking-the-file-extension-td2341055.html

IWorkbook workbook = WorkbookFactory.Create(inputStream);

If you're not sure of the Sheet's name but you are sure of the index (0 based), you can grab the sheet like this:

ISheet sheet = workbook.GetSheetAt(sheetIndex);

You can then iterate through the rows using code supplied by the accepted answer from mj82

Solution 3 - C#

I find NPOI very usefull for working with Excel Files, here is my implementation (Comments are in Spanish, sorry for that):

This Method Opens an Excel (both xls or xlsx) file and converts it into a DataTable.

		/// <summary>Abre un archivo de Excel (xls o xlsx) y lo convierte en un DataTable.
	/// LA PRIMERA FILA DEBE CONTENER LOS NOMBRES DE LOS CAMPOS.</summary>
	/// <param name="pRutaArchivo">Ruta completa del archivo a abrir.</param>
	/// <param name="pHojaIndex">Número (basado en cero) de la hoja que se desea abrir. 0 es la primera hoja.</param>
	private DataTable Excel_To_DataTable(string pRutaArchivo, int pHojaIndex)
	{
		// --------------------------------- //
		/* REFERENCIAS:
		 * NPOI.dll
		 * NPOI.OOXML.dll
		 * NPOI.OpenXml4Net.dll */
		// --------------------------------- //
		/* USING:
		 * using NPOI.SS.UserModel;
		 * using NPOI.HSSF.UserModel;
		 * using NPOI.XSSF.UserModel; */
		// AUTOR: Ing. Jhollman Chacon R. 2015
		// --------------------------------- //
		DataTable Tabla = null;
		try
		{
			if (System.IO.File.Exists(pRutaArchivo))
			{

				IWorkbook workbook = null;	//IWorkbook determina si es xls o xlsx	 			
				ISheet worksheet = null;
				string first_sheet_name = "";

				using (FileStream FS = new FileStream(pRutaArchivo, FileMode.Open, FileAccess.Read))
				{
					workbook = WorkbookFactory.Create(FS);			//Abre tanto XLS como XLSX
					worksheet = workbook.GetSheetAt(pHojaIndex);	//Obtener Hoja por indice
					first_sheet_name = worksheet.SheetName;			//Obtener el nombre de la Hoja

					Tabla = new DataTable(first_sheet_name);
					Tabla.Rows.Clear();
					Tabla.Columns.Clear();

					// Leer Fila por fila desde la primera
					for (int rowIndex = 0; rowIndex <= worksheet.LastRowNum; rowIndex++)
					{
						DataRow NewReg = null;
						IRow row = worksheet.GetRow(rowIndex);
						IRow row2 = null;
						IRow row3 = null;

						if (rowIndex == 0)
						{
							row2 = worksheet.GetRow(rowIndex + 1); //Si es la Primera fila, obtengo tambien la segunda para saber el tipo de datos
							row3 = worksheet.GetRow(rowIndex + 2); //Y la tercera tambien por las dudas
						}

						if (row != null) //null is when the row only contains empty cells 
						{
							if (rowIndex > 0) NewReg = Tabla.NewRow();

							int colIndex = 0;
							//Leer cada Columna de la fila
							foreach (ICell cell in row.Cells)
							{
								object valorCell = null;
								string cellType = "";
								string[] cellType2 = new string[2];

								if (rowIndex == 0) //Asumo que la primera fila contiene los titlos:
								{
									for (int i = 0; i < 2; i++)
									{
										ICell cell2 = null;
										if (i == 0) { cell2 = row2.GetCell(cell.ColumnIndex); }
										else { cell2 = row3.GetCell(cell.ColumnIndex); }

										if (cell2 != null)
										{
											switch (cell2.CellType)
											{
												case CellType.Blank: break;
												case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
												case CellType.String: cellType2[i] = "System.String"; break;
												case CellType.Numeric:
													if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
													else
													{
														cellType2[i] = "System.Double";  //valorCell = cell2.NumericCellValue;
													}
													break;

												case CellType.Formula:														
													bool continuar = true;
													switch (cell2.CachedFormulaResultType)
													{
														case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
														case CellType.String: cellType2[i] = "System.String"; break;
														case CellType.Numeric:
															if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
															else
															{
																try
																{
																	//DETERMINAR SI ES BOOLEANO
																	if (cell2.CellFormula == "TRUE()") { cellType2[i] = "System.Boolean"; continuar = false; }
																	if (continuar && cell2.CellFormula == "FALSE()") { cellType2[i] = "System.Boolean"; continuar = false; }
																	if (continuar) { cellType2[i] = "System.Double"; continuar = false; }
																}
																catch { }
															} break;
													}
													break;
												default:
													cellType2[i] = "System.String"; break;
											}
										}
									}

									//Resolver las diferencias de Tipos
									if (cellType2[0] == cellType2[1]) { cellType = cellType2[0]; }
									else
									{
										if (cellType2[0] == null) cellType = cellType2[1];
										if (cellType2[1] == null) cellType = cellType2[0];
										if (cellType == "") cellType = "System.String";
									}

									//Obtener el nombre de la Columna
									string colName = "Column_{0}";
									try { colName = cell.StringCellValue; }
									catch { colName = string.Format(colName, colIndex); }

									//Verificar que NO se repita el Nombre de la Columna
									foreach (DataColumn col in Tabla.Columns)
									{
										if (col.ColumnName == colName) colName = string.Format("{0}_{1}", colName, colIndex);
									}

									//Agregar el campos de la tabla:
									DataColumn codigo = new DataColumn(colName, System.Type.GetType(cellType));
									Tabla.Columns.Add(codigo); colIndex++;
								}
								else
								{
									//Las demas filas son registros:
									switch (cell.CellType)
									{
										case CellType.Blank: valorCell = DBNull.Value; break;
										case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
										case CellType.String: valorCell = cell.StringCellValue; break;
										case CellType.Numeric:
											if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
											else { valorCell = cell.NumericCellValue; } break;
										case CellType.Formula:
											switch (cell.CachedFormulaResultType)
											{
												case CellType.Blank: valorCell = DBNull.Value; break;
												case CellType.String: valorCell = cell.StringCellValue; break;
												case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
												case CellType.Numeric:
													if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
													else { valorCell = cell.NumericCellValue; }
													break;
											}
											break;
										default: valorCell = cell.StringCellValue; break;
									}
									//Agregar el nuevo Registro
									if (cell.ColumnIndex <= Tabla.Columns.Count - 1) NewReg[cell.ColumnIndex] = valorCell;
								}
							}
						}
						if (rowIndex > 0) Tabla.Rows.Add(NewReg);
					}
					Tabla.AcceptChanges();
				}
			}
			else
			{
				throw new Exception("ERROR 404: El archivo especificado NO existe.");
			}
		}
		catch (Exception ex)
		{
			throw ex;
		}
		return Tabla;
	}

This Second method does the oposite, saves a DataTable into an Excel File, yeah it can either be xls or the new xlsx, your choise!

		/// <summary>Convierte un DataTable en un archivo de Excel (xls o Xlsx) y lo guarda en disco.</summary>
	/// <param name="pDatos">Datos de la Tabla a guardar. Usa el nombre de la tabla como nombre de la Hoja</param>
	/// <param name="pFilePath">Ruta del archivo donde se guarda.</param>
	private void DataTable_To_Excel(DataTable pDatos, string pFilePath)
	{
		try
		{
			if (pDatos != null && pDatos.Rows.Count > 0)
			{
				IWorkbook workbook = null;
				ISheet worksheet = null;

				using (FileStream stream = new FileStream(pFilePath, FileMode.Create, FileAccess.ReadWrite))
				{
					string Ext = System.IO.Path.GetExtension(pFilePath); //<-Extension del archivo
					switch (Ext.ToLower())
					{
						case ".xls":
							HSSFWorkbook workbookH = new HSSFWorkbook();
							NPOI.HPSF.DocumentSummaryInformation dsi = NPOI.HPSF.PropertySetFactory.CreateDocumentSummaryInformation();
							dsi.Company = "Cutcsa"; dsi.Manager = "Departamento Informatico";
							workbookH.DocumentSummaryInformation = dsi;
							workbook = workbookH;
							break;

						case ".xlsx": workbook = new XSSFWorkbook(); break;
					}

					worksheet = workbook.CreateSheet(pDatos.TableName); //<-Usa el nombre de la tabla como nombre de la Hoja

					//CREAR EN LA PRIMERA FILA LOS TITULOS DE LAS COLUMNAS
					int iRow = 0;
					if (pDatos.Columns.Count > 0)
					{
						int iCol = 0;
						IRow fila = worksheet.CreateRow(iRow);
						foreach (DataColumn columna in pDatos.Columns)
						{
							ICell cell = fila.CreateCell(iCol, CellType.String);
							cell.SetCellValue(columna.ColumnName);
							iCol++;
						}
						iRow++;
					}

					//FORMATOS PARA CIERTOS TIPOS DE DATOS
					ICellStyle _doubleCellStyle = workbook.CreateCellStyle();
					_doubleCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0.###");

					ICellStyle _intCellStyle = workbook.CreateCellStyle();
					_intCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0");

					ICellStyle _boolCellStyle = workbook.CreateCellStyle();
					_boolCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("BOOLEAN");

					ICellStyle _dateCellStyle = workbook.CreateCellStyle();
					_dateCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy");

					ICellStyle _dateTimeCellStyle = workbook.CreateCellStyle();
					_dateTimeCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy HH:mm:ss");

					//AHORA CREAR UNA FILA POR CADA REGISTRO DE LA TABLA
					foreach (DataRow row in pDatos.Rows)
					{
						IRow fila = worksheet.CreateRow(iRow);
						int iCol = 0;
						foreach (DataColumn column in pDatos.Columns)
						{
							ICell cell = null; //<-Representa la celda actual								
							object cellValue = row[iCol]; //<- El valor actual de la celda

							switch (column.DataType.ToString())
							{
								case "System.Boolean":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.Boolean);

										if (Convert.ToBoolean(cellValue)) { cell.SetCellFormula("TRUE()"); }
										else { cell.SetCellFormula("FALSE()"); }

										cell.CellStyle = _boolCellStyle;
									}
									break;

								case "System.String":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.String);
										cell.SetCellValue(Convert.ToString(cellValue));
									}
									break;

								case "System.Int32":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.Numeric);
										cell.SetCellValue(Convert.ToInt32(cellValue));
										cell.CellStyle = _intCellStyle;
									}
									break;
								case "System.Int64":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.Numeric);
										cell.SetCellValue(Convert.ToInt64(cellValue));
										cell.CellStyle = _intCellStyle;
									}
									break;
								case "System.Decimal":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.Numeric);
										cell.SetCellValue(Convert.ToDouble(cellValue));
										cell.CellStyle = _doubleCellStyle;
									}
									break;
								case "System.Double":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.Numeric);
										cell.SetCellValue(Convert.ToDouble(cellValue));
										cell.CellStyle = _doubleCellStyle;
									}
									break;

								case "System.DateTime":
									if (cellValue != DBNull.Value)
									{
										cell = fila.CreateCell(iCol, CellType.Numeric);
										cell.SetCellValue(Convert.ToDateTime(cellValue));

										//Si No tiene valor de Hora, usar formato dd-MM-yyyy
										DateTime cDate = Convert.ToDateTime(cellValue);
										if (cDate != null && cDate.Hour > 0) { cell.CellStyle = _dateTimeCellStyle; }
										else { cell.CellStyle = _dateCellStyle; }
									}
									break;
								default:
									break;
							}
							iCol++;
						}
						iRow++;
					}

					workbook.Write(stream);
					stream.Close();
				}
			}
		}
		catch (Exception ex)
		{
			throw ex;
		}
	}

With this 2 methods you can Open an Excel file, load it into a DataTable, do your modifications and save it back into an Excel file.

Hope you guys find this usefull.

Solution 4 - C#

As Janoulle pointed out, you don't need to detect which extension it is if you use the WorkbookFactory, it will do it for you. I recently had to implement a solution using NPOI to read Excel files and import email addresses into a sql database. My main problem was that I was probably going to receive about 12 different Excel layouts from different customers so I needed something that could be changed quickly without much code. I ended up using Npoi.Mapper which is an awesome tool! Highly recommended!

Here is my complete solution:

using System.IO;
using System.Linq;
using Npoi.Mapper;
using Npoi.Mapper.Attributes;
using NPOI.SS.UserModel;

namespace JobCustomerImport.Processors
{
    public class ExcelEmailProcessor
    {
        private UserManagementServiceContext DataContext { get; }

        public ExcelEmailProcessor(int customerNumber)
        {
            DataContext = new UserManagementServiceContext();
        }

        public void Execute(string localPath, int sheetIndex)
        {
            IWorkbook workbook;
            using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
            {
                workbook = WorkbookFactory.Create(file);
            }

            var importer = new Mapper(workbook);
            var items = importer.Take<MurphyExcelFormat>(sheetIndex);
            foreach(var item in items)
            {
                var row = item.Value;
                if (string.IsNullOrEmpty(row.EmailAddress))
                    continue;

                UpdateUser(row);
            }

            DataContext.SaveChanges();
        }

        private void UpdateUser(MurphyExcelFormat row)
        {
            //LOGIC HERE TO UPDATE A USER IN DATABASE...
        }

        private class MurphyExcelFormat
        {
            [Column("District")]
            public int District { get; set; }

            [Column("DM")]
            public string FullName { get; set; }

            [Column("Email Address")]
            public string EmailAddress { get; set; }

            [Column(3)]
            public string Username { get; set; }

            public string FirstName
            {
                get
                {
                    return Username.Split('.')[0];
                }
            }

            public string LastName
            {
                get
                {
                    return Username.Split('.')[1];
                }
            }
        }
    }
}

I am so happy with NPOI + Npoi.Mapper (from Donny Tian) as an Excel import solution that I wrote a blog post about it, going in to more detail about this code above. You can read it here if you wish: Easiest way to import excel files. The best thing about this solution is that it runs perfectly in a serverless azure/cloud environment which I couldn't get with other Excel tools/libraries.

Solution 5 - C#

private DataTable GetDataTableFromExcel(String Path)
    {
        XSSFWorkbook wb;
        XSSFSheet sh;
        String Sheet_name;

        using (var fs = new FileStream(Path, FileMode.Open, FileAccess.Read))
        {
            wb = new XSSFWorkbook(fs);
                         
               Sheet_name= wb.GetSheetAt(0).SheetName;  //get first sheet name
        }
        DataTable DT = new DataTable();
        DT.Rows.Clear();
        DT.Columns.Clear();

        // get sheet
        sh = (XSSFSheet)wb.GetSheet(Sheet_name);

        int i = 0;
        while (sh.GetRow(i) != null)
        {
            // add neccessary columns
            if (DT.Columns.Count < sh.GetRow(i).Cells.Count)
            {
                for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
                {
                    DT.Columns.Add("", typeof(string));
                }
            }

            // add row
            DT.Rows.Add();

            // write row value
            for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
            {
                var cell = sh.GetRow(i).GetCell(j);

                if (cell != null)
                {
                    // TODO: you can add more cell types capatibility, e. g. formula
                    switch (cell.CellType)
                    {
                        case NPOI.SS.UserModel.CellType.Numeric:
                            DT.Rows[i][j] = sh.GetRow(i).GetCell(j).NumericCellValue;
                            //dataGridView1[j, i].Value = sh.GetRow(i).GetCell(j).NumericCellValue;

                            break;
                        case NPOI.SS.UserModel.CellType.String:
                            DT.Rows[i][j] = sh.GetRow(i).GetCell(j).StringCellValue;

                            break;
                    }
                }
            }

            i++;
        }

        return DT;
    }

Solution 6 - C#

private static ISheet GetFileStream(string fullFilePath)
    {
        var fileExtension = Path.GetExtension(fullFilePath);
        string sheetName;
        ISheet sheet = null;
        switch (fileExtension)
        {
            case ".xlsx":
                using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
                {
                    var wb = new XSSFWorkbook(fs);
                    sheetName = wb.GetSheetAt(0).SheetName;
                    sheet = (XSSFSheet) wb.GetSheet(sheetName);
                }
                break;
            case ".xls":
                using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
                {
                    var wb = new HSSFWorkbook(fs);
                    sheetName = wb.GetSheetAt(0).SheetName;
                    sheet = (HSSFSheet) wb.GetSheet(sheetName);
                }
                break;
        }
        return sheet;
    }

    private static DataTable GetRequestsDataFromExcel(string fullFilePath)
    {
        try
        {
            var sh = GetFileStream(fullFilePath);
            var dtExcelTable = new DataTable();
            dtExcelTable.Rows.Clear();
            dtExcelTable.Columns.Clear();
            var headerRow = sh.GetRow(0);
            int colCount = headerRow.LastCellNum;
            for (var c = 0; c < colCount; c++)
                dtExcelTable.Columns.Add(headerRow.GetCell(c).ToString());
            var i = 1;
            var currentRow = sh.GetRow(i);
            while (currentRow != null)
            {
                var dr = dtExcelTable.NewRow();
                for (var j = 0; j < currentRow.Cells.Count; j++)
                {
                    var cell = currentRow.GetCell(j);

                    if (cell != null)
                        switch (cell.CellType)
                        {
                            case CellType.Numeric:
                                dr[j] = DateUtil.IsCellDateFormatted(cell)
                                    ? cell.DateCellValue.ToString(CultureInfo.InvariantCulture)
                                    : cell.NumericCellValue.ToString(CultureInfo.InvariantCulture);
                                break;
                            case CellType.String:
                                dr[j] = cell.StringCellValue;
                                break;
                            case CellType.Blank:
                                dr[j] = string.Empty;
                                break;
                        }
                }
                dtExcelTable.Rows.Add(dr);
                i++;
                currentRow = sh.GetRow(i);
            }
            return dtExcelTable;
        }
        catch (Exception e)
        {
            throw;
        }
    }

Solution 7 - C#

Since you've asked to read and modify the xls file I have changed @mj82's answer to correspond your needs.

HSSFWorkbook does not have Save method, but it does have Write to a stream.

static void Main(string[] args)
        {
            string filepath = @"C:\test.xls";

            HSSFWorkbook hssfwb;

            using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Read))
            {
                hssfwb = new HSSFWorkbook(file);
            }

            ISheet sheet = hssfwb.GetSheetAt(0);

            for (int row = 0; row <= sheet.LastRowNum; row++)
            {
                if (sheet.GetRow(row) != null) //null is when the row only contains empty cells 
                {
                    // Set new cell value
                    sheet.GetRow(row).GetCell(0).SetCellValue("foo");
                    Console.WriteLine("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue);
                }
            }

            // Save the file
            using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Write))
            {
                hssfwb.Write(file);
            }

            Console.ReadLine();
        }

Solution 8 - C#

If you don't want to use NPOI.Mapper, then I'd advise you to check out this solution - it handles reading excel cell into various type and also has a simple import helper: https://github.com/hidegh/NPOI.Extensions

var data = sheet.MapTo<OrderDetails>(true, rowMapper =>
{
  // map singleItem
  return new OrderDetails()
  {
    Date = rowMapper.GetValue<DateTime>(SheetColumnTitles.Date),
            
    // use reusable mapper for re-curring scenarios
    Region = regionMapper(rowMapper.GetValue<string>(SheetColumnTitles.Region)),
            
    Representative = rowMapper.GetValue<string>(SheetColumnTitles.Representative),
    Item = rowMapper.GetValue<string>(SheetColumnTitles.Item),
    Units = rowMapper.GetValue<int>(SheetColumnTitles.Units),
    UnitCost = rowMapper.GetValue<decimal>(SheetColumnTitles.UnitCost),
    Total = rowMapper.GetValue<decimal>(SheetColumnTitles.Total),
            
    // read date and total as string, as they're displayed/formatted on the excel
    DateFormatted = rowMapper.GetValue<string>(SheetColumnTitles.Date),
    TotalFormatted = rowMapper.GetValue<string>(SheetColumnTitles.Total)
  };
});

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAkhilView Question on Stackoverflow
Solution 1 - C#mj82View Answer on Stackoverflow
Solution 2 - C#janoulleView Answer on Stackoverflow
Solution 3 - C#JhollmanView Answer on Stackoverflow
Solution 4 - C#DanView Answer on Stackoverflow
Solution 5 - C#LuckySView Answer on Stackoverflow
Solution 6 - C#Shailesh B. RathiView Answer on Stackoverflow
Solution 7 - C#gyosifovView Answer on Stackoverflow
Solution 8 - C#baHIView Answer on Stackoverflow