DEV Community

Judy

Java, perform COUNT on each group of a large csv file #eg33

data.csv is a large CSV file that cannot fit into memory; its 3rd column is the grouping column, as shown below:

Date,Time,Sub User,Access Method
10-10-2023,00:03:06,JL,cli
10-10-2023,00:02:20,TW2JL,app
10-10-2023,00:03:26,JL,cli
10-10-2023,00:03:34,JL,cli
10-10-2023,00:03:35,JL,cli
10-10-2023,00:03:46,JL,cli
10-10-2023,00:04:09,JL,cli
10-10-2023,00:04:51,JL,cli
10-10-2023,00:04:56,JL,cli
10-10-2023,00:05:58,JL,cli
10-10-2023,00:06:29,JL,cli
10-10-2023,00:06:42,JL,cli
10-10-2023,00:26:35,TW2JL,app
10-10-2023,00:30:01,TW2JL,app
10-10-2023,00:30:02,TW2JL,app
10-10-2023,00:30:05,TW2JL,app
10-10-2023,00:33:42,TW2JL,app
10-10-2023,00:36:36,TW2JL,app
10-10-2023,00:45:10,TW2JL,app
10-10-2023,00:53:01,TW2JL,app
10-10-2023,00:53:24,TW2JL,app
10-10-2023,01:03:14,TW2JL,app
10-10-2023,01:03:18,TW2JL,app
10-10-2023,01:03:20,TW2JL,app

Task: Use Java to group the rows by the 3rd column and count the records in each group. Below is the expected result:

Sub User    cnt
JL          11
TW2JL       13

Write the following SPL statement:

=T@c("data.csv").groups('Sub User';count(1):cnt)

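The T() function parses the CSV file, and the @c option makes it read the file through a cursor instead of loading it into memory in one go. The groups() function performs the grouping and aggregation: it groups the rows by Sub User, and count(1) counts the rows in each group and names the result column cnt.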
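For comparison, the same streaming idea can be sketched in plain Java without SPL. This is only an illustration, not how SPL works internally: the class name GroupCount and the method countByColumn are hypothetical, and the code assumes a simple CSV with a header row and no quoted commas inside fields. Only the running counts are kept in memory, so it works as long as the number of distinct groups is small.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of SPL or of the original post.
public class GroupCount {

    /**
     * Reads the file one line at a time and counts the rows for each
     * distinct value in the given 0-based column. The first line is
     * treated as a header and skipped.
     */
    static Map<String, Long> countByColumn(Path csv, int column) throws IOException {
        Map<String, Long> counts = new TreeMap<>(); // keeps group keys sorted
        try (BufferedReader reader = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            String line = reader.readLine(); // header row, ignored
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // tolerate blank lines
                }
                // Assumption: no quoted fields containing commas.
                String[] fields = line.split(",", -1);
                if (fields.length <= column) {
                    continue; // skip rows that are too short
                }
                counts.merge(fields[column], 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path csv = Paths.get(args.length > 0 ? args[0] : "data.csv");
        Map<String, Long> counts = countByColumn(csv, 2); // 3rd column
        System.out.println("Sub User\tcnt");
        counts.forEach((user, cnt) -> System.out.println(user + "\t" + cnt));
    }
}
```

Run against the sample data above, this should report the same two groups as the SPL statement. The SPL version stays a one-liner because parsing, cursor handling, and aggregation are built in.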

Read How to Call a SPL Script in Java to learn how to integrate SPL into a Java application.

Source

SPL open source address
