spreadsheet-reader

A PHP spreadsheet reader (Excel XLS and XLSX, OpenOffice ODS, and variously separated text files) with a singular goal of getting the data out, efficiently

  • Owner: nuovo/spreadsheet-reader
  • License: Other


spreadsheet-reader is a PHP spreadsheet reader that differs from others in that its main goal is efficient
data extraction that can handle large (as in really large) files. So far it is not necessarily CPU-, time-
or I/O-efficient, but at least it won't run out of memory (except maybe for XLS files).

So far XLSX, ODS and text/CSV file parsing should be memory-efficient. XLS file parsing is done with php-excel-reader
from http://code.google.com/p/php-excel-reader/ which, sadly, has memory issues with bigger spreadsheets, as it reads the
data all at once and keeps it all in memory.

Requirements:

Usage:

All data is read from the file sequentially, with each row being returned as a numeric array.
This is about the easiest way to read a file:

<?php
	// If you need to parse XLS files, include php-excel-reader
	require('php-excel-reader/excel_reader2.php');

	require('SpreadsheetReader.php');

	$Reader = new SpreadsheetReader('example.xlsx');
	foreach ($Reader as $Row)
	{
		print_r($Row);
	}
?>
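
Because each row arrives as a plain numeric array, a common next step is to treat the first row as a header and key the remaining rows by column name. The helper below is a minimal sketch of that idea. RowsWithHeader is a hypothetical name, not part of the library, and it assumes only that its argument can be iterated with foreach, as $Reader is in the example above.

```php
<?php
// Hypothetical helper, not part of spreadsheet-reader.
// Accepts anything that can be iterated with foreach and yields numeric row arrays,
// and yields associative rows keyed by the values of the first row.
function RowsWithHeader($Rows)
{
	$Header = null;
	foreach ($Rows as $Row)
	{
		$Row = array_values($Row);
		if ($Header === null)
		{
			// The first row supplies the column names
			$Header = $Row;
			continue;
		}

		// Pad or truncate so every row has exactly as many cells as the header
		$Count = count($Header);
		$Row = array_slice(array_pad($Row, $Count, null), 0, $Count);
		yield array_combine($Header, $Row);
	}
}
```

Calling it as foreach (RowsWithHeader($Reader) as $Record) would hand back one record at a time, so the sequential, low-memory reading described above is preserved. The sketch assumes the header cells are unique strings or integers.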

Reading multiple sheets is also supported for file formats where that is possible. (A CSV file is handled as if
it had only one sheet.)

You can retrieve information about sheets contained in the file by calling the Sheets() method which returns an array with
sheet indexes as keys and sheet names as values. Then you can change the sheet that's currently being read by passing that index
to the ChangeSheet($Index) method.

Example:

<?php
	$Reader = new SpreadsheetReader('example.xlsx');
	$Sheets = $Reader -> Sheets();

	foreach ($Sheets as $Index => $Name)
	{
		echo 'Sheet #'.$Index.': '.$Name;

		$Reader -> ChangeSheet($Index);

		foreach ($Reader as $Row)
		{
			print_r($Row);
		}
	}
?>

If ChangeSheet() is called with the index of the sheet that is already open, the position in the file still reverts to the beginning,
so that the behavior is the same as when changing to a different sheet.
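
The two methods can also be wrapped in a small helper that walks every sheet. This is a sketch only: CountRowsPerSheet is a hypothetical function, and it relies solely on the Sheets() and ChangeSheet($Index) calls and the foreach iteration described in this README.

```php
<?php
// Hypothetical helper, not part of spreadsheet-reader.
// Expects an object that offers Sheets(), ChangeSheet($Index) and foreach iteration.
function CountRowsPerSheet($Reader)
{
	$Counts = array();
	foreach ($Reader -> Sheets() as $Index => $Name)
	{
		// ChangeSheet() starts reading from the first row of the chosen sheet,
		// even when that sheet is already the current one
		$Reader -> ChangeSheet($Index);

		$Counts[$Index] = array('Name' => $Name, 'Rows' => 0);
		foreach ($Reader as $Row)
		{
			$Counts[$Index]['Rows']++;
		}
	}
	return $Counts;
}
```

The result is keyed by sheet index rather than by name, on the assumption that indexes are the safer identifier since they are what ChangeSheet() accepts.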

Testing

From the command line:

php test.php path-to-spreadsheet.xls

In the browser:

http://path-to-library/test.php?File=/path/to/spreadsheet.xls

Notes about library performance

  • CSV and text files are read strictly sequentially so performance should be O(n);
  • When parsing XLS files, all of the file content is read into memory so large XLS files can lead to "out of memory" errors;
  • XLSX files use so-called "shared strings" internally to optimize for cases where the same string is repeated multiple times.
    Internally, XLSX is XML text that is parsed sequentially to extract data from it. In some cases, however, these shared strings are a problem:
    Excel may put all, or nearly all, of the strings from the spreadsheet in the shared string file (which is a separate XML text), and not necessarily in the same
    order. The worst case is when they are in reverse order: for each string, the shared string XML has to be parsed from the beginning if we want to avoid keeping the data in memory.
    To that end, the XLSX parser has a cache for shared strings that is used if the total shared string count is not too high. If you get out of memory errors, you can
    try lowering the SHARED_STRING_CACHE_LIMIT constant in SpreadsheetReader_XLSX.
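
The shared string trade-off in the last point can be illustrated with a toy model. This is not the library's code: SharedStringLookup, its constructor arguments and the Scans counter are hypothetical, and a plain PHP array stands in for the shared string XML. It shows only the idea that below a limit the strings are loaded once and served from memory, while above it every lookup rescans from the beginning.

```php
<?php
// Toy model only; names and structure are assumptions, not the library's implementation.
class SharedStringLookup
{
	private $Source;       // stands in for the shared string XML, read front to back
	private $CacheLimit;   // maximum number of strings we are willing to keep in memory
	private $Cache = null;
	public $Scans = 0;     // how many sequential passes over the source were started

	public function __construct(array $Source, $CacheLimit)
	{
		$this -> Source = array_values($Source);
		$this -> CacheLimit = $CacheLimit;
	}

	public function Get($Index)
	{
		if (count($this -> Source) <= $this -> CacheLimit)
		{
			// Few enough strings: read them all once, then answer from memory
			if ($this -> Cache === null)
			{
				$this -> Scans++;
				$this -> Cache = $this -> Source;
			}
			return isset($this -> Cache[$Index]) ? $this -> Cache[$Index] : null;
		}

		// Too many strings to cache: scan from the beginning for every lookup
		$this -> Scans++;
		foreach ($this -> Source as $Position => $Value)
		{
			if ($Position === $Index)
			{
				return $Value;
			}
		}
		return null;
	}
}
```

With a limit at or above the string count, any number of lookups costs a single pass. With a lower limit, memory use stays flat but every lookup starts a new pass, which is why strings referenced in reverse order are the worst case.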

TODOs:

  • ODS date formats;

Licensing

All of the code in this library is licensed under the MIT license as included in the LICENSE file. For now, however, the library
relies on the php-excel-reader library for XLS file parsing, which is licensed under the PHP license.

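Key metrics

Overview

  • Name and owner: nuovo/spreadsheet-reader
  • Primary language: PHP
  • Languages: PHP (language count: 1)
  • License: Other

Owner activity

  • Created: 2012-01-13 03:49:39
  • Last pushed: 2023-02-14 09:52:16
  • Last commit: 2015-04-30 11:54:58
  • Releases: 1
  • Latest release: v0.5.11 (released 2015-04-30 11:55:58)
  • First release: v0.5.11 (released 2015-04-30 11:55:58)

User engagement

  • Stars: 673
  • Watchers: 49
  • Forks: 495
  • Commits: 90
  • Issues: 105
  • Open issues: 67
  • Pull requests: 31
  • Open pull requests: 29
  • Closed pull requests: 21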