DEV Community

loading...

[TypeScript] Read spreadsheets by SheetJS

Masui Masanori
Programmer, husband, father I love C#, TypeScript, etc.
・4 min read

Intro

This time, I will try reading spreadsheets by SheetJS.

Environments

  • Node.js ver.16.5.0
  • TypeScript ver.4.3.5
  • Webpack ver.5.42.0
  • xlsx ver.0.17.0

Caution

SheetJS has Professional Edition.
But because I only tried Community Edition in this time, so maybe some my problems had been resolved in Professional Edition.

Base projects

Spreadsheet sample

Alt Text

index.html

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>xlsx sample</title>
        <meta charset="utf-8">
        <link rel="stylesheet" href="../css/month_picker.css" />
    </head>
    <body>
        <input type="file" id="file_selector">
        <script src="js/main.page.js"></script>
    </body>
</html>
Enter fullscreen mode Exit fullscreen mode

main.page.ts

import * as sheetLoader from './spreadsheetLoader';
function init(): void {
    const fileSelector = document.getElementById('file_selector') as HTMLInputElement;
    fileSelector.onchange = async () => await loadSpreadsheet();
}
async function loadSpreadsheet(): Promise<void> {
    const fileSelector = document.getElementById('file_selector') as HTMLInputElement;
    if(fileSelector.files == null ||
        fileSelector.files.length <= 0) {
        alert('No Files');
        return;
    }
    await sheetLoader.load(fileSelector.files[0]);
}
init();
Enter fullscreen mode Exit fullscreen mode

Read

Load workbooks, worksheets, cells

spreadsheetLoader.ts

import * as XLSX from 'xlsx';

export async function load(file: File): Promise<void> {
    const buffer = await file.arrayBuffer();
    // Get workbook
    const workbook = XLSX.read(buffer, { type: 'buffer'});
    // Get worksheet
    const firstSheet = workbook.Sheets[workbook.SheetNames[0]];
    const cellB1: XLSX.CellObject = firstSheet['B1'];
    console.log(cellB1);

    // I can get cell address name from column and row indices.
    const cellB3: XLSX.CellObject = firstSheet[XLSX.utils.encode_cell({ c: 1, r: 2 })];
    console.log(cellB3);   
}
Enter fullscreen mode Exit fullscreen mode

Result

{
    h: "一回目"
    r: "<t>一回目</t><rPh sb=\"0\" eb=\"2\"><t>イッカイ</t></rPh><rPh sb=\"2\" eb=\"3\"><t>メ</t></rPh><phoneticPr fontId=\"1\"/>"
    t: "s"
    v: "一回目"
    w: "一回目"
    __proto__: Object
}
{
    h: "3.4MB"
    r: "<t>3.4MB</t><phoneticPr fontId=\"1\"/>"
    t: "s"
    v: "3.4MB"
    w: "3.4MB"
    __proto__: Object
}
Enter fullscreen mode Exit fullscreen mode

Reading file type

Although I can set the File instance as an argument of "XLSX.read" because the type is any, if I do so, I will get an runtime exception.
Because the method will try using "fs".

spreadsheetLoader.ts

export async function load(file: File): Promise<void> {
    const workbook = XLSX.read(file, { type: 'file'});
...
}
Enter fullscreen mode Exit fullscreen mode

Result

xlsx.js?1146:2730 Uncaught (in promise) TypeError: _fs.readFileSync is not a function
    at read_binary (xlsx.js?1146:2730)
    at Object.readSync (xlsx.js?1146:21018)
    at Object.eval (spreadsheetLoader.ts?0fa7:65)
    at step (spreadsheetLoader.ts?0fa7:52)
    at Object.eval [as next] (spreadsheetLoader.ts?0fa7:33)
    at eval (spreadsheetLoader.ts?0fa7:27)
    at new Promise (<anonymous>)
    at __awaiter (spreadsheetLoader.ts?0fa7:23)
    at Object.load (spreadsheetLoader.ts?0fa7:62)
    at eval (main.page.ts?2f3c:88)
...
Enter fullscreen mode Exit fullscreen mode

Get empty cell value?

If I try getting empty cell values, I will get "undefined".

spreadsheetLoader.ts

...
    const cellA1: XLSX.CellObject = firstSheet['A1'];
    // undefined
    console.log(cellA1);
...
Enter fullscreen mode Exit fullscreen mode

Get cell names

I can get cell names from XLSX.Workbook.

spreadsheetLoader.ts

...
...
    if(workbook.Workbook?.Names != null &&
            workbook.Workbook?.Names.length > 0) {
        // get names
        for(const n of workbook.Workbook.Names) {
            console.log(`Workbook Name: ${n.Name} Ref:${n.Ref}`);
        }
    }
...
Enter fullscreen mode Exit fullscreen mode

Result

Workbook Name: NAME_IN_BOOK Ref:Sheet1!$B$2
Workbook Name: NAME_IN_SHEET Ref:Sheet1!$C$2
Enter fullscreen mode Exit fullscreen mode

One problem is I can't separate the name is named in Workbooks or Worksheets.

Get dates

spreadsheetLoader.ts

...
    for(let i = 1; i < 5; i++) {
        const cellC: XLSX.CellObject = firstSheet[XLSX.utils.encode_cell({ c: 2, r: i })];
        console.log(cellC);
    }
...
Enter fullscreen mode Exit fullscreen mode

Result

{
    t: "n"
    v: 44398
    w: "Wednesday, July 21, 2021"
    __proto__: Object
}
{
    t: "n"
    v: 44398
    w: "7/21/21"
    __proto__: Object
} 
{
    t: "n"
    v: 44398
    w: "ggg2021年7月21日"
    __proto__: Object
}
{
    t: "n"
    v: 44398
    w: "2021年7月21日"
    __proto__: Object
}
Enter fullscreen mode Exit fullscreen mode

Because I can't get '令和3年7月21日' directly.
So I must get like below.

...
export async function load(file: File): Promise<void> {
...
    const cellC4: XLSX.CellObject = firstSheet['C4'];
    if(cellC4.w != null &&
            /(ggg)[0-9]{4}[0-9]+月[0-9]+日/.test(cellC4.w)) {
        const c4Value = getJapaneseCalender(cellC4.w);
        console.log(c4Value);
    }
}
function getJapaneseCalender(formattedDateText: string): string {
    /** because the text like '2021年7月21日' can't be converted to date,
     * I split the text and generate 'yyyy-M-d' and convert to date.
     */
    const yearMatch = formattedDateText.match(/[0-9]{4}年/);
    if(yearMatch == null) {
        return formattedDateText;
    }
    const monthMath = formattedDateText.match(/[0-9]+月/);
    if(monthMath == null) {
        return formattedDateText;
    }
    const dayMatch = formattedDateText.match(/[0-9]+日/);
    if(dayMatch == null) {
        return formattedDateText;
    }
    const yearText = yearMatch.toString().replace('', '');
    const monthText = monthMath.toString().replace('', '');
    const dayText = dayMatch.toString().replace('', '');
    const date = new Date(`${yearText}-${monthText}-${dayText}`);

    return new Intl.DateTimeFormat('ja-JP-u-ca-japanese', {era: 'long' }).format(date);
}
Enter fullscreen mode Exit fullscreen mode

Get and set formulas

spreadsheetLoader.ts

...
    // =SUM(D3:D9)
    const cellD10: XLSX.CellObject = firstSheet['D10'];
    console.log(cellD10);
    // =if
    const cellD11: XLSX.CellObject = firstSheet['D11'];
    console.log(cellD11);
    // =IFERROR(D11, 3)
    const cellD12: XLSX.CellObject = firstSheet['D12'];
    console.log(cellD12);

    // I can set formulas in cells. but it won't be executed before saving.
    XLSX.utils.sheet_set_array_formula(firstSheet, 'A2', 'SUM(D3:D4)');
    const cellA2: XLSX.CellObject = firstSheet['A2'];
    console.log(cellA2);
...
Enter fullscreen mode Exit fullscreen mode

Result

{
    f: "SUM(D3:D9)"
    t: "n"
    v: 19.14
    w: "19.14"
    __proto__: Object
}
{
    F: "D11"
    f: "if"
    t: "e"
    v: 29
    w: "#NAME?"
}
{
    f: "IFERROR(D11, 3)"
    t: "n"
    v: 3
    w: "3"
}
{
    F: "A2"
    f: "SUM(D3:D4)"
    t: "n"
    __proto__: Object
}
Enter fullscreen mode Exit fullscreen mode

Others

I can't treat shapes, images, etc.
And I also can't treat cell styles(borders, background colors and so on).

I felt at least now, and in web applications, if all I want to do are only getting cell texts and I don't want users to upload files, I can use SheetJS.

On other cases, I felt I should use ClosedXML, EPPlus or some other libraries in server side programs.

Discussion (0)