Basic Usage

Import the library

// es6
import DataFrame from 'dataframe-js';
// or, to import Row as well
import DataFrame, { Row } from 'dataframe-js';
// es5
var DataFrame = require('dataframe-js').DataFrame;
// Browser
var DataFrame = dfjs.DataFrame;

Create a DataFrame

You can create a DataFrame in multiple ways:

const df = new DataFrame(data, columns);

// From a collection (easier)
const df = new DataFrame([
    {c1: 1, c2: 6}, // <------- A row
    {c4: 1, c3: 2}
], ['c1', 'c2', 'c3', 'c4']);

// From a table
const df = new DataFrame([
    [1, 6, 9, 10, 12], // <------- A row
    [1, 2],
    [6, 6, 9, 8, 9, 12],
], ['c1', 'c2', 'c3', 'c4', 'c5', 'c6']);

// From a dictionary (Hash)
const df = new DataFrame({
    column1: [3, 6, 8], // <------ A column
    column2: [3, 4, 5, 6],
}, ['column1', 'column2']);

// From files
DataFrame.fromText('/my/absolute/path/myfile.txt').then(df => df);
DataFrame.fromDSV('/my/absolute/path/myfile.txt').then(df => df);
DataFrame.fromPSV('http://myurl/myfile.psv').then(df => df);
DataFrame.fromTSV('http://myurl/myfile.tsv').then(df => df);
DataFrame.fromCSV('http://myurl/myfile.csv').then(df => df);
DataFrame.fromJSON('http://myurl/myfile.json').then(df => df);
DataFrame.fromJSON(new File(...)).then(df => df);
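The collection, table and dictionary forms above describe the same kind of data in three shapes. As a rough illustration, the plain-JavaScript sketch below normalizes a table and a dictionary into a collection of row objects. The helpers `tableToCollection` and `dictToCollection` are hypothetical and are not part of dataframe-js, and filling missing values with `undefined` is an assumption made for this sketch.

```javascript
// Plain-JavaScript sketch, not dataframe-js internals.
// Assumption: values missing from a row are filled with undefined.

// Table: an array of row arrays, matched to column names by position.
function tableToCollection(table, columns) {
  return table.map(row =>
    Object.fromEntries(columns.map((name, i) => [name, row[i]]))
  );
}

// Dictionary: column name -> array of values.
// The row count is the length of the longest column.
function dictToCollection(dict, columns) {
  const rowCount = Math.max(0, ...columns.map(name => (dict[name] || []).length));
  return Array.from({ length: rowCount }, (_, i) =>
    Object.fromEntries(columns.map(name => [name, (dict[name] || [])[i]]))
  );
}

const fromTable = tableToCollection([[1, 6], [1, 2]], ['c1', 'c2']);
const fromDict = dictToCollection({ c1: [1, 1], c2: [6, 2] }, ['c1', 'c2']);

console.log(fromTable); // [ { c1: 1, c2: 6 }, { c1: 1, c2: 2 } ]
console.log(fromDict);  // [ { c1: 1, c2: 6 }, { c1: 1, c2: 2 } ]
```

Both calls produce the same rows, which is why the collection form is often the easiest one to read and write by hand.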

Export or Convert a DataFrame

In the same way, you can export your DataFrame to files or convert it to JavaScript objects:

const df = new DataFrame(data, columns);

// To native objects
df.toCollection();
df.toArray();
df.toDict();

// To files
df.toText(true, ';', '/my/absolute/path/myfile.txt');
df.toDSV(true, ';', '/my/absolute/path/myfile.txt');
df.toCSV(true, '/my/absolute/path/myfile.csv');
df.toTSV(true, '/my/absolute/path/myfile.tsv');
df.toPSV(true, '/my/absolute/path/myfile.psv');
df.toJSON(true, '/my/absolute/path/myfile.json');
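To make the idea of a delimiter-separated export concrete, here is a minimal plain-JavaScript sketch that serializes a collection to text with an optional header line. `collectionToDSV` is a hypothetical helper, not a dataframe-js function, and it does no quoting or escaping; the library's actual output format and handling of missing values may differ.

```javascript
// Hypothetical helper, not part of dataframe-js: serialize a collection to
// delimiter-separated text. Assumption: undefined values become empty fields.
function collectionToDSV(collection, columns, sep = ';', header = true) {
  const lines = collection.map(row =>
    columns
      .map(name => (row[name] === undefined ? '' : String(row[name])))
      .join(sep)
  );
  if (header) lines.unshift(columns.join(sep));
  return lines.join('\n');
}

const text = collectionToDSV(
  [{ c1: 1, c2: 6 }, { c1: 1, c2: 2 }],
  ['c1', 'c2']
);
console.log(text);
// c1;c2
// 1;6
// 1;2
```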

DataFrame

The main Object of the dataframe-js library is the DataFrame. It provides 3 types of methods:

Information methods, which give details about your DataFrame.

// Some examples
df.show();
df.dim();

Column manipulations, which let you select, reorganize, cast, join or analyze your data.

// Some examples
df.select('column1', 'column3');
df.cast('column3', String);
df.distinct('column2');
df.innerJoin(df2, ['column2', 'column3']);

Row manipulations, which let you filter, modify, join or complete your data.

// Some examples
df.push([1, 2, 3], [4, 5, 6]);
df.map(row => row.set('column2', row.get('column1') * 2));
df.filter(row => row.get('column2') !== 4);
df.union(df2);
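For readers who think in terms of plain arrays, the sketch below mirrors a column selection, a per-row update and a row filter on an ordinary array of objects. It is only an analogy written without the library: the `select` helper and the sample data are made up for illustration, and each step returns new data rather than changing its input.

```javascript
const rows = [
  { column1: 1, column2: 2, column3: 'a' },
  { column1: 2, column2: 5, column3: 'b' },
];

// Column manipulation: keep only the named columns (analogous to select).
const select = (collection, ...names) =>
  collection.map(row => Object.fromEntries(names.map(n => [n, row[n]])));

// Row manipulations: recompute column2 from column1 (analogous to map with
// row.set), then drop the rows where column2 equals 4 (analogous to filter).
const doubled = rows.map(row => ({ ...row, column2: row.column1 * 2 }));
const kept = doubled.filter(row => row.column2 !== 4);

const selected = select(rows, 'column1', 'column3');
console.log(selected);
// [ { column1: 1, column3: 'a' }, { column1: 2, column3: 'b' } ]
console.log(kept);
// [ { column1: 1, column2: 2, column3: 'a' } ]
```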

Row

As you can see above, the Row API is used, for example, with the .map() and .filter() DataFrame methods.

The Row API provides simple manipulations to get, set, delete or check data in each row of your DataFrames.

// Some examples
row.get('column1');
row.set('column2', newValue);

GroupedDataFrame

When you call the DataFrame .groupBy() method, a new GroupedDataFrame object is created. You can use it to compute aggregations (as in SQL) that summarize your data.

Each group in the GroupedDataFrame is a DataFrame. When you aggregate a GroupedDataFrame object, you get a DataFrame with one row per group and a new column named "aggregation".

// Some examples
const groupedDF = df.groupBy('column1', 'column2');
groupedDF.aggregate(group => group.count()).rename('aggregation', 'groupCount');
df.groupBy('column2', 'column3').aggregate(group => group.stat.mean('column4')).rename('aggregation', 'groupMean');
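The "one row per group, plus an aggregation column" result can be pictured with a small plain-JavaScript sketch. `groupAggregate` is a hypothetical helper written for this illustration, not the library's implementation, and the sample rows are invented.

```javascript
// Hypothetical helper: group rows by the key columns, then reduce each group
// to a single value stored under "aggregation".
function groupAggregate(collection, keys, aggregate) {
  const groups = new Map();
  for (const row of collection) {
    const id = JSON.stringify(keys.map(k => row[k]));
    if (!groups.has(id)) groups.set(id, []);
    groups.get(id).push(row);
  }
  return [...groups.values()].map(group => ({
    ...Object.fromEntries(keys.map(k => [k, group[0][k]])),
    aggregation: aggregate(group),
  }));
}

const sample = [
  { column1: 'a', column4: 10 },
  { column1: 'a', column4: 20 },
  { column1: 'b', column4: 5 },
];
const meanOfColumn4 = group =>
  group.reduce((sum, row) => sum + row.column4, 0) / group.length;

const groupCounts = groupAggregate(sample, ['column1'], group => group.length);
const groupMeans = groupAggregate(sample, ['column1'], meanOfColumn4);

console.log(groupCounts);
// [ { column1: 'a', aggregation: 2 }, { column1: 'b', aggregation: 1 } ]
console.log(groupMeans);
// [ { column1: 'a', aggregation: 15 }, { column1: 'b', aggregation: 5 } ]
```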

Stat Module

The Stat module provides basic statistical computations on DataFrame columns.

// Some examples
df.stat.max('column1');
df.stat.mean('column1');
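As a point of comparison, a column maximum and mean can be computed by hand on a plain array of row objects as sketched below. The helper names are hypothetical, and the sketch assumes every value in the column is a number; how the library treats missing or non-numeric values is not covered here.

```javascript
// Hypothetical helpers operating on a plain collection of row objects.
// Assumption: the column is non-empty and holds only numbers.
const columnValues = (collection, name) => collection.map(row => row[name]);
const columnMax = (collection, name) =>
  columnValues(collection, name).reduce((a, b) => (b > a ? b : a));
const columnMean = (collection, name) => {
  const values = columnValues(collection, name);
  return values.reduce((sum, v) => sum + v, 0) / values.length;
};

const data = [{ column1: 3 }, { column1: 6 }, { column1: 9 }];
console.log(columnMax(data, 'column1'));  // 9
console.log(columnMean(data, 'column1')); // 6
```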

Matrix Module

The Matrix module provides mathematical matrix operations between DataFrames.

// Some examples
df.matrix.add(df2);
df.matrix.product(8);
df.matrix.dot(df2);
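The three operations correspond to familiar matrix arithmetic: element-wise addition, multiplication by a scalar, and the matrix (dot) product. The plain-JavaScript sketch below works on arrays of arrays; the function names are chosen for this example and it assumes compatible dimensions without checking them.

```javascript
// Element-wise sum of two matrices with the same shape.
const addMatrices = (a, b) => a.map((row, i) => row.map((v, j) => v + b[i][j]));

// Multiply every value by a scalar.
const scaleMatrix = (a, k) => a.map(row => row.map(v => v * k));

// Matrix product: the column count of a must equal the row count of b.
const dotMatrices = (a, b) =>
  a.map(row =>
    b[0].map((_, j) => row.reduce((sum, v, k) => sum + v * b[k][j], 0))
  );

const m1 = [[1, 2], [3, 4]];
const m2 = [[5, 6], [7, 8]];

console.log(addMatrices(m1, m2)); // [ [ 6, 8 ], [ 10, 12 ] ]
console.log(scaleMatrix(m1, 8));  // [ [ 8, 16 ], [ 24, 32 ] ]
console.log(dotMatrices(m1, m2)); // [ [ 19, 22 ], [ 43, 50 ] ]
```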

SQL Module

Finally, the SQL module lets you register temporary tables and query them using SQL syntax.

// Some examples
// Register a temporary table, either from the DataFrame itself
df.sql.register('tmp2');
// or through the static API
DataFrame.sql.registerTable(df, 'tmp2');
// Query the table
DataFrame.sql.request('SELECT * FROM tmp2 WHERE column1 = 6');
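Conceptually, registering a table stores data under a name, and the query above keeps the rows whose column1 equals 6. The sketch below imitates that with a plain `Map` and a filter; `tables`, `registerTable` and `selectWhereEquals` are hypothetical stand-ins written for this example and say nothing about how the SQL module is implemented.

```javascript
// Hypothetical registry standing in for temporary tables.
const tables = new Map();
const registerTable = (collection, name) => tables.set(name, collection);

// Hand-written equivalent of: SELECT * FROM <tableName> WHERE <column> = <value>
const selectWhereEquals = (tableName, column, value) => {
  if (!tables.has(tableName)) throw new Error(`Unknown table: ${tableName}`);
  return tables.get(tableName).filter(row => row[column] === value);
};

registerTable(
  [{ column1: 6, column2: 'x' }, { column1: 3, column2: 'y' }],
  'tmp2'
);
const matches = selectWhereEquals('tmp2', 'column1', 6);
console.log(matches); // [ { column1: 6, column2: 'x' } ]
```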
