Microsoft.Data.Analysis
A DataFrame to support indexing, binary operations, sorting, selection and other APIs. This will eventually also expose an IDataView for ML.NET
Wraps a DataFrame around an Arrow RecordBatch without copying data
Returns an enumeration of Arrow RecordBatch objects, mostly without copying data
Performs an element-wise addition on each column
Performs an element-wise subtraction on each column
Performs an element-wise multiplication on each column
Performs an element-wise division on each column
Performs an element-wise modulus operation on each column
Performs an element-wise boolean And on each column
Performs an element-wise boolean Or on each column
Performs an element-wise boolean Xor on each column
Performs an element-wise left shift on each column
Performs an element-wise right shift on each column
Performs an element-wise equals on each column
Performs an element-wise not-equals on each column
Performs an element-wise greater than or equal on each column
Performs an element-wise less than or equal on each column
Performs an element-wise greater than on each column
Performs an element-wise less than on each column
Performs a reversed element-wise addition on each column
Performs a reversed element-wise subtraction on each column
Performs a reversed element-wise multiplication on each column
Performs a reversed element-wise division on each column
Performs a reversed element-wise modulus operation on each column
Performs a reversed element-wise boolean And on each column
Performs a reversed element-wise boolean Or on each column
Performs a reversed element-wise boolean Xor on each column
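The difference between the regular and the "reversed" operations only matters for non-commutative operators such as subtraction and division. The following is a minimal sketch that models a column as a plain Python list; the `elementwise` helper, its parameters, and the choice to keep `None` (null) entries as null are assumptions made for illustration and are not the library's implementation.

```python
from typing import Callable, List, Optional

Number = Optional[float]

def elementwise(column: List[Number], value: float,
                op: Callable[[float, float], float],
                reverse: bool = False) -> List[Number]:
    """Apply op between every non-null element and a scalar value."""
    result: List[Number] = []
    for item in column:
        if item is None:
            result.append(None)              # assumption: nulls stay null
        elif reverse:
            result.append(op(value, item))   # reversed: value OP element
        else:
            result.append(op(item, value))   # regular: element OP value
    return result

def subtract(a: float, b: float) -> float:
    return a - b

column = [1.0, None, 4.0]
forward = elementwise(column, 10.0, subtract)                 # element - 10
reversed_ = elementwise(column, 10.0, subtract, reverse=True) # 10 - element
```

For commutative operators such as addition, both variants give the same values; the reversed form exists so that the scalar can appear on the left-hand side.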
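Constructs a DataFrame with the given columns.
The columns of this DataFrame.
Returns the columns contained in the DataFrame as a DataFrameColumnCollection
Returns a DataFrameRowCollection that contains a view of the rows in this DataFrame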
An Indexer to get or set values.
Zero based row index
Zero based column index
The value stored at the intersection of the row index and the column index
Returns a new DataFrame using the boolean values in the given column
A column of booleans
Returns a new DataFrame using the row indices in the given column
A column of row indices
Returns a new DataFrame using the row indices in the given column
A column of row indices
Returns a new DataFrame using the boolean values in filter
A column of booleans
Returns a new DataFrame using the row indices in the given column
A column of row indices
Returns a new DataFrame using the row indices in the given column
A column of row indices
Returns a new DataFrame using the given row indices
Returns a new DataFrame using the given row indices
Returns a new DataFrame using the given boolean values
An indexer based on the column name
The name of a column
The column, if it exists.
Throws if the column name is not present in this DataFrame
Returns the first rows
Returns the last rows
Returns a full copy
Generates a concise summary of each column in the DataFrame
Generates descriptive statistics that summarize each numeric column
Orders the data frame by a specified column.
The column name to order by.
Sorting order.
If true, null values are always put at the end.
Orders the data frame by a specified column in descending order.
The column name to order by.
If true, null values are always put at the end.
Clamps values beyond the specified thresholds on numeric columns
Minimum value. All values below this threshold will be set to it
Maximum value. All values above this threshold will be set to it
Indicates if the operation should be performed in place
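As a rough illustration of the clamping semantics and of the in-place flag described above, here is a small sketch on a plain Python list. The function name `clamp`, the `in_place` parameter, and the decision to leave `None` (null) entries untouched are assumptions for this example, not the library's code.

```python
from typing import List, Optional

def clamp(values: List[Optional[float]], min_value: float, max_value: float,
          in_place: bool = False) -> List[Optional[float]]:
    """Set values below min_value to min_value and above max_value to max_value."""
    target = values if in_place else list(values)  # copy unless in place
    for i, v in enumerate(target):
        if v is None:
            continue                 # assumption: nulls are left as they are
        if v < min_value:
            target[i] = min_value
        elif v > max_value:
            target[i] = max_value
    return target

data = [-5.0, 2.0, None, 40.0]
copy = clamp(data, 0.0, 10.0)                      # data is left unchanged
unchanged_after_copy = data == [-5.0, 2.0, None, 40.0]
same = clamp(data, 0.0, 10.0, in_place=True)       # data itself is modified
```

Returning the target in both cases lets the caller chain further operations without caring whether a copy was made.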
Adds a prefix to the column names
Adds a suffix to the column names
Returns a random sample of rows
Number of rows in the returned DataFrame
Groups the rows of the DataFrame by unique values in the given column.
The column used to group unique values
A GroupBy object that stores the group information.
Groups the rows of the DataFrame by unique values in the given column.
Type of column used for grouping
The column used to group unique values
A GroupBy object that stores the group information.
Returns a DataFrame with no missing values
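Dropping rows with missing values can follow one of two policies, which this documentation names "Any" and "All" in the options for DropNull(). The sketch below models rows as Python lists with `None` for null; the `DropNullOptions` enum and the `drop_nulls` function are hypothetical names used only to illustrate the two policies.

```python
from enum import Enum
from typing import Any, List

class DropNullOptions(Enum):
    ANY = "Any"   # drop a row if any of its values is null
    ALL = "All"   # drop a row only when all of its values are null

def drop_nulls(rows: List[List[Any]],
               options: DropNullOptions = DropNullOptions.ANY) -> List[List[Any]]:
    """Return the rows that survive the chosen null-dropping policy."""
    kept = []
    for row in rows:
        null_flags = [value is None for value in row]
        if options is DropNullOptions.ANY:
            drop = any(null_flags)
        else:
            drop = all(null_flags)
        if not drop:
            kept.append(row)
    return kept

rows = [[1, "a"], [None, "b"], [None, None]]
kept_any = drop_nulls(rows, DropNullOptions.ANY)
kept_all = drop_nulls(rows, DropNullOptions.ALL)
```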
Fills null values with the given value.
The value to replace null with.
A boolean flag to indicate if the operation should be in place
A new DataFrame if the in-place flag is not set. Returns this DataFrame otherwise.
Fills null values in each column with the corresponding value from the given list.
The values to replace null with, one value per column. The number of values should equal the number of columns in this DataFrame.
A boolean flag to indicate if the operation should be in place
A new DataFrame if the in-place flag is not set. Returns this DataFrame otherwise.
Appends rows to the DataFrame
If an input column's value doesn't match a DataFrameColumn's data type, a conversion will be attempted
If a row in the given rows is null, a null value is appended to each column
Values are appended based on the column names
The rows to be appended to this DataFrame
If set, appends the rows in place. Otherwise, a new DataFrame is returned with the appended rows
Culture info for formatting values
Appends a row to the DataFrame
If a column's value doesn't match its column's data type, a conversion will be attempted
If the row is null, a null value is appended to each column
If set, appends the row in place. Otherwise, a new DataFrame is returned with the appended row
Culture info for formatting values
Appends a row by enumerating column names and values from the given enumeration
If a column's value doesn't match its column's data type, a conversion will be attempted
An enumeration of column name and value to be appended
If set, appends the row in place. Otherwise, a new DataFrame is returned with the appended row
Culture info for formatting values
Invalidates any cached data after a column has changed.
A preview of the contents of this DataFrame as a string.
A preview of the contents of this DataFrame.
A preview of the contents of this DataFrame as a string.
Max amount of rows to show in preview.
Reads a text file as a DataFrame.
filename
column separator
has a header or not
column names (can be empty)
column types (can be empty)
number of rows to read
number of rows used to guess types
add one column with the row index
The character encoding. Defaults to UTF8 if not specified
If set to true, columns with repeated names are auto-renamed.
culture info for formatting values
DataFrame
Returns the column name at the given column index if the column names are not null or empty; otherwise returns "Column{i}", where i is the column index.
column names.
column index.
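The fallback naming rule described above can be sketched in a few lines. The function name `get_column_name` and its parameter names are chosen for this illustration, and the sketch assumes the index is within range whenever names are supplied.

```python
from typing import Optional, Sequence

def get_column_name(column_names: Optional[Sequence[str]], column_index: int) -> str:
    """Use the supplied name when names are given, else fall back to Column{i}."""
    if column_names is not None and len(column_names) > 0:
        return column_names[column_index]
    return f"Column{column_index}"

named = get_column_name(["id", "price"], 1)
from_empty = get_column_name([], 2)
from_none = get_column_name(None, 0)
```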
Reads CSV data passed in as a string into a DataFrame.
csv data passed in as a string
column separator
has a header or not
column names (can be empty)
column types (can be empty)
number of rows to read not including the header(if present)
number of rows used to guess types
add one column with the row index
If set to true, columns with repeated names are auto-renamed.
culture info for formatting values
function used to guess the type of a column based on its values
Reads a seekable stream of CSV data into a DataFrame.
stream of CSV data to be read in
column separator
has a header or not
column names (can be empty)
column types (can be empty)
number of rows to read not including the header(if present)
number of rows used to guess types
add one column with the row index
The character encoding. Defaults to UTF8 if not specified
If set to true, columns with repeated names are auto-renamed.
culture info for formatting values
function used to guess the type of a column based on its values
Writes a DataFrame into a CSV.
CSV file path
column separator
has a header or not
The character encoding. Defaults to UTF8 if not specified
culture info for formatting values
Saves a DataFrame into a CSV.
CSV file path
column separator
has a header or not
The character encoding. Defaults to UTF8 if not specified
culture info for formatting values
Writes a DataFrame into a CSV.
stream of CSV data to be written out
column separator
has a header or not
the character encoding. Defaults to UTF8 if not specified
culture info for formatting values
Saves a DataFrame into a CSV.
stream of CSV data to be written out
column separator
has a header or not
the character encoding. Defaults to UTF8 if not specified
culture info for formatting values
Joins columns of another DataFrame
The other DataFrame to join.
The suffix to add to this DataFrame's columns if there are common column names
The suffix to add to the other DataFrame's columns if there are common column names
The join algorithm to use.
A new DataFrame
Merge DataFrames with a database style join (for backward compatibility)
Options for DropNull().
"Any" drops a row if any of the row values are null.
"All" drops a row when all of the row values are null.
A basic mutable store to hold values in a DataFrame column. Supports wrapping with an ArrowBuffer
The base column type. All APIs should be defined here first
Performs element-wise addition
Performs an element-wise addition on each value in the column
Performs a reversed element-wise addition on each value in the column
Performs element-wise subtraction
Performs an element-wise subtraction on each value in the column
Performs a reversed element-wise subtraction on each value in the column
Performs element-wise multiplication
Performs an element-wise multiplication on each value in the column
Performs a reversed element-wise multiplication on each value in the column
Performs element-wise division
Performs an element-wise division on each value in the column
Performs a reversed element-wise division on each value in the column
Performs element-wise modulus
Performs an element-wise modulus operation on each value in the column
Performs a reversed element-wise modulus operation on each value in the column
Performs element-wise boolean And
Performs an element-wise boolean And on each value in the column
Performs a reversed element-wise boolean And on each value in the column
Performs element-wise boolean Or
Performs an element-wise boolean Or on each value in the column
Performs a reversed element-wise boolean Or on each value in the column
Performs element-wise boolean Xor
Performs an element-wise boolean Xor on each value in the column
Performs a reversed element-wise boolean Xor on each value in the column
Performs an element-wise left shift on each value in the column
Performs an element-wise right shift on each value in the column
Performs element-wise equals
Performs an element-wise equals on each value in the column
Performs element-wise not-equals
Performs an element-wise not-equals on each value in the column
Performs element-wise greater than or equal
Performs an element-wise greater than or equal on each value in the column
Performs element-wise less than or equal
Performs an element-wise less than or equal on each value in the column
Performs element-wise greater than
Performs an element-wise greater than on each value in the column
Performs element-wise less than
Performs an element-wise less than on each value in the column
Performs an element-wise equal to Null on each value in the column
Performs an element-wise not equal to Null on each value in the column
Updates each numeric element with its absolute numeric value
Returns whether all the elements are True
Returns whether any element is True
Updates each element with its cumulative maximum
Updates column values at rowIndices with its cumulative rowIndices maximum
Updates each element with its cumulative minimum
Updates column values at rowIndices with its cumulative rowIndices minimum
Updates each element with its cumulative product
Updates column values at rowIndices with its cumulative rowIndices product
Updates each element with its cumulative sum
Updates column values at rowIndices with its cumulative rowIndices sum
Returns the maximum of the values in the column
Returns the maximum of the values at rowIndices
Returns the minimum of the values in the column
Returns the minimum of the values at the rowIndices
Returns the product of the values in the column
Returns the product of the values at the rowIndices
Returns the sum of the values in the column
Returns the sum of the values at the rowIndices
Calls Math.Round on each value in a column
The base constructor.
The name of this column.
The length of this column.
The type of data this column holds.
A static factory method to create a column.
It allows you to take advantage of type inference based on the type of the values supplied.
The type of the column to create.
The name of the column.
The initial values to populate in the column.
A column populated with the provided data.
A static factory method to create a column.
It allows you to take advantage of type inference based on the type of the values supplied.
The type of the column to create.
The name of the column.
The initial values to populate in the column.
A column populated with the provided data.
A static factory method to create a column.
It allows you to take advantage of type inference based on the type of the values supplied.
The name of the column.
The initial values to populate in the column.
A column populated with the provided data.
The length of this column
The number of null values in this column.
The column name.
Updates the column name.
The new name.
Updates the name of this column.
The new name.
Ignored (for backward compatibility)
Indicates if the value at this index is valid (not null).
The index to look up.
A boolean value indicating the validity at this index.
The type of data this column holds.
Indexer to get/set values at the given row index
The index to look up
The value at the given row index
Returns the value at the given row index.
The value at the given row index.
Returns the requested number of values starting from the given start index.
The first index to return values from.
The number of values to return.
A read only list of values
Sets the value at the given row index with the new value
The row index
The new value.
Returns the requested number of values starting from the given start index.
The first index to return values from.
The number of values to return.
A read only list of values
Returns an enumerator that iterates this column.
Called internally from Append, Merge and GroupBy. Resizes the column to the specified length to allow setting values by indexing
The new length of the column
Clones the column to produce a copy
A new column
Clones the column to produce a copy, potentially changing the order of values by supplying mapIndices and an invert flag
A new column
Clones the column to produce a copy, potentially changing the order of values by supplying mapIndices and an invert flag
A new column
Returns a copy of this column sorted by its values.
Sorting order.
If true, null values are always put at the end.
Groups the rows of this column by their value.
The type of data held by this column
A mapping of each value to the indices containing that value. Should be a sorted collection.
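The value-to-indices mapping that grouping produces can be modeled as below. This is a sketch on a plain Python list, not the library's implementation; the function name `group_column_values` and the decision to skip `None` (null) entries are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Optional

def group_column_values(values: List[Optional[Hashable]]) -> Dict[Hashable, List[int]]:
    """Map each distinct non-null value to the row indices where it occurs."""
    groups: Dict[Hashable, List[int]] = defaultdict(list)
    for row_index, value in enumerate(values):
        if value is None:
            continue                      # assumption: nulls are not grouped
        groups[value].append(row_index)
    # Return the groups ordered by key, matching the "sorted collection" note.
    return dict(sorted(groups.items()))

groups = group_column_values(["b", "a", "b", None, "a"])
```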
Gets the occurrences of each value from this column in the other column, grouped by this value
A mapping of index from this column to the indices of the same value in the other column
Gets the occurrences of each value from this column in the other column, grouped by this value
A mapping of index from this column to the indices of the same value in the other column
Returns a DataFrame containing counts of unique values
Returns a new column with null elements replaced by the given value.
Tries to convert value to the column's DataType
Indicates if the operation should be performed in place
Returns a column with no missing values.
Returns the max number of values that are contiguous in memory
Creates a ValueGetter that will return the value of the column for the row the cursor is referencing.
The row cursor which has the current position.
Adds a new schema column to the specified builder for the current column.
The builder to which to add the schema column.
Appends a value to this column using the given row cursor
The row cursor which has the current position
The cached ValueGetter for this column.
Returns the ValueGetter for each active column in the row cursor as a delegate to be cached.
The row cursor which has the current position
The schema column to return the ValueGetter for.
Clamps values beyond the specified thresholds
Minimum value. All values below this threshold will be set to it
Maximum value. All values above this threshold will be set to it
Indicates if the operation should be performed in place
Clamps values beyond the specified thresholds
Minimum value. All values below this threshold will be set to it
Maximum value. All values above this threshold will be set to it
Indicates if the operation should be performed in place
Returns a new column filtered by the lower and upper bounds
The minimum value in the resulting column
The maximum value in the resulting column
Returns a new column filtered by the lower and upper bounds
Determines if the column is of a numeric type
Returns the mean of the values in the column. Throws if this is not a numeric column
Returns the median of the values in the column. Throws if this is not a numeric column
Used to exclude columns from the Description method
Returns a column containing the DataType and Length of this column
Returns a column with statistics that describe the column
A preview of the contents of this column as a string.
A preview of the contents of this column.
Returns the indices that, when applied, result in this column being sorted.
Sorting order.
If true, null values are always put at the end.
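Returning sort indices instead of sorted values is often called an argsort. The sketch below shows one way to compute such indices with nulls placed last, on a plain Python list. The name `sort_indices` and its parameters are hypothetical, and the behavior when nulls are not forced to the end (treating null as the smallest value) is an assumption of this example only.

```python
from typing import List, Optional

def sort_indices(values: List[Optional[float]], ascending: bool = True,
                 put_nulls_at_end: bool = True) -> List[int]:
    """Return the indices that would put values in sorted order."""
    non_null = [i for i, v in enumerate(values) if v is not None]
    nulls = [i for i, v in enumerate(values) if v is None]
    non_null.sort(key=lambda i: values[i], reverse=not ascending)
    # Assumption: when nulls are not forced to the end they count as the
    # smallest value, so they come first only in ascending order.
    if put_nulls_at_end or not ascending:
        return non_null + nulls
    return nulls + non_null

values = [3.0, None, 1.0, 2.0]
ascending_order = sort_indices(values)
descending_order = sort_indices(values, ascending=False)
sorted_view = [values[i] for i in ascending_order]
```

Applying the returned indices to several columns at once keeps their rows aligned, which is why an index list is more useful here than a sorted copy.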
A DataFrameColumnCollection is just a container that holds a number of DataFrameColumn instances.
Searches for a column with the specified name and returns the zero-based index of the first occurrence if found. Returns -1 otherwise
An indexer based on the column name
The name of a column
The column, if it exists.
Throws if the column name is not present in this DataFrame
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified and attempts to return it as an . If is not of type , an exception is thrown.
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
Gets the with the specified .
The name of the column
.
A column named cannot be found, or if the column's type doesn't match.
An immutable column to hold Arrow style strings
Constructs an empty column with the given name.
The name of the column.
Constructs a column with the given name, length and null count. The values, offsets and null bits are the contents of the column in the Arrow format.
The name of the column.
The Arrow formatted string values in this column.
The Arrow formatted offsets in this column.
The Arrow formatted null bits in this column.
The length of the column.
The number of null values in this column.
Returns an enumeration of immutable buffers representing the underlying values in the Apache Arrow format
Null values are encoded in the buffers returned by GetReadOnlyNullBitmapBuffers in the Apache Arrow format
The offsets buffers returned by GetReadOnlyOffsetBuffers can be used to delineate each value
An enumeration of buffers whose elements are the raw data buffers for the UTF8 string values.
Returns an enumeration of immutable buffers representing null values in the Apache Arrow format
Each buffer encodes the indices of null values in its corresponding Data buffer
An enumeration of buffers whose elements encode the null bit maps for the column's values
Returns an enumeration of immutable buffers representing offsets into the corresponding Data buffer.
The Apache Arrow format specifies how the offset buffer encodes the length of each value in the Data buffer
An enumeration of offset buffers.
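To make the relationship between the three buffers concrete, the sketch below decodes strings from a data buffer, an offsets buffer, and a null bitmap. It assumes the Apache Arrow variable-length string layout with 32-bit little-endian offsets (one more offset than there are values) and a least-significant-bit-first bitmap in which a set bit means "not null". The function name and the sample buffers are made up for this example.

```python
import struct
from typing import List, Optional

def decode_arrow_strings(data: bytes, offsets: bytes, null_bitmap: bytes,
                         length: int) -> List[Optional[str]]:
    """Decode `length` UTF-8 strings from Arrow-style buffers."""
    # length + 1 offsets: value i occupies data[starts[i]:starts[i + 1]]
    starts = struct.unpack_from(f"<{length + 1}i", offsets)
    result: List[Optional[str]] = []
    for i in range(length):
        is_valid = (null_bitmap[i // 8] >> (i % 8)) & 1
        if is_valid:
            result.append(data[starts[i]:starts[i + 1]].decode("utf-8"))
        else:
            result.append(None)           # unset bit: the value is null
    return result

# Three values: "hi", null, "wörld" ("ö" takes two bytes in UTF-8).
data = "hi".encode("utf-8") + "wörld".encode("utf-8")
offsets = struct.pack("<4i", 0, 2, 2, 8)
null_bitmap = bytes([0b00000101])         # bits 0 and 2 set, bit 1 unset
decoded = decode_arrow_strings(data, offsets, null_bitmap, 3)
```

Note that the null entry still has an offset pair; it simply spans zero bytes, so the offsets alone cannot distinguish a null from an empty string.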
Indexer to get values. This is an immutable column
Zero based row index
The value stored at this row index
Returns the requested number of values starting from the given start index.
The index of the first value to return.
The number of values to return starting from the start index
A new list of string values
Returns an enumerator that iterates through the string values in this column.
Returns a boolean column that is the result of an elementwise equality comparison of each value in the column with the given value
Returns a boolean column that is the result of an elementwise not-equal comparison of each value in the column with the given value
Applies a function to all the values
The function to apply
A column containing the new string values
This function converts from UTF-8 to UTF-16 strings
A mutable column to hold strings
Is NOT Arrow compatible
Applies a function to all values in the column that are not null.
The function to apply.
A boolean flag to indicate if the operation should be in place.
A new column if the in-place flag is not set. Returns this column otherwise.
Returns a new column with null elements replaced by the given value.
Tries to convert value to the column's DataType
Indicates if the operation should be performed in place
Column to hold VBuffer
Constructs an empty VBufferDataFrameColumn with the given name.
The name of the column.
Length of values
Returns an enumerator that iterates through the VBuffer values in this column.
A DataFrameRow is a collection of values that represent a row in a DataFrame.
Returns an enumerator of the values in this row.
An indexer to return the value at the given index.
The index of the value to return
The value at this index.
An indexer to return the value at the given column name.
The name of the column that corresponds to the return value
The value at this column name.
A simple string representation of the values in this row
Represents the rows of a DataFrame
Initializes a DataFrameRowCollection.
An indexer to return the DataFrameRow at the given index
The row index
Returns an enumerator of DataFrameRow objects
The number of rows in this DataFrame.
A GroupBy class that is typically the result of a DataFrame.GroupBy call.
It holds information to perform typical aggregation ops on it.
Compute the number of non-null values in each group
Return the first value in each group
Returns the first rows of each group
Returns the last rows of each group
Compute the max of group values
The columns to find the max of. A default value finds the max of all columns
Compute the min of group values
The columns to find the min of. A default value finds the min of all columns
Compute the product of group values
The columns to find the product of. A default value finds the product of all columns
Compute the sum of group values
The columns to sum. A default value sums up all columns
Compute the mean of group values
The columns to find the mean of. A default value finds the mean of all columns
Returns a collection of Grouping objects, where each object represents a set of DataFrameRows having the same Key
PrimitiveColumnContainer is just a store for the column data. APIs that want to change the data must be defined in PrimitiveDataFrameColumn
A null value has an unset bit
A NON-null value has a set bit
A column to hold primitive types such as int, float etc.
Returns an enumerable of immutable memory buffers representing the underlying values
Null values are encoded in the buffers returned by GetReadOnlyNullBitmapBuffers in the Apache Arrow format
IEnumerable
Returns an enumerable of immutable buffers representing null values in the Apache Arrow format
Each buffer encodes the null values for its corresponding Data buffer
IEnumerable
Returns a new column with null elements replaced by the given value.
Indicates if the operation should be performed in place.
Returns a clone of this column.
Returns a clone of this column.
A column whose values are used as indices
Applies a function to all column values in place.
The function to apply
Applies a function to all values in the column that are not null.
The function to apply.
A boolean flag to indicate if the operation should be in place.
A new column if the in-place flag is not set. Returns this column otherwise.
Applies a function to all column values.
The new column's type
The function to apply
A new PrimitiveDataFrameColumn containing the new values
Clamps values beyond the specified thresholds
Minimum value. All values below this threshold will be set to it
Maximum value. All values above this threshold will be set to it
Indicates if the operation should be performed in place
Returns a new column filtered by the lower and upper bounds
The minimum value in the resulting column
The maximum value in the resulting column
A basic immutable store to hold values in a DataFrame column. Supports wrapping with an ArrowBuffer
Peek at characters of the next data line without reading the line
The number of characters to look at in the next data line.
A string consisting of the first characters of the next line. If numberOfChars is greater than the length of the next line, only the next line is returned
Set the number of bits in a span of bytes starting
at a specific index, and limiting to length.
Span to set bits value.
Bit index to start counting from.
Maximum of bits in the span to consider.
Bit value.
Returns the population count (number of bits set) in a span of bytes starting
at 0 bit and limiting to length of bits.
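A population count over a bit span, as described above, can be sketched as follows. The sketch assumes least-significant-bit-first ordering within each byte (as used by Arrow validity bitmaps); the function name `count_set_bits` is hypothetical.

```python
def count_set_bits(data: bytes, length: int) -> int:
    """Count the set bits among the first `length` bits of `data`."""
    full_bytes, remainder = divmod(length, 8)
    count = sum(bin(byte).count("1") for byte in data[:full_bytes])
    if remainder:
        mask = (1 << remainder) - 1       # keep only the lowest `remainder` bits
        count += bin(data[full_bytes] & mask).count("1")
    return count

bitmap = bytes([0b11111111, 0b00000101])
first_ten = count_set_bits(bitmap, 10)     # all of byte 0 plus bits 0..1 of byte 1
first_eleven = count_set_bits(bitmap, 11)  # additionally includes bit 2 of byte 1
```

With a validity bitmap, subtracting this count from the column length gives the number of null values.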
A strongly-typed resource class, for looking up localized strings, etc.
Returns the cached ResourceManager instance used by this class.
Overrides the current thread's CurrentUICulture property for all
resource lookups using this strongly typed resource class.
Looks up a localized string similar to ... {0} of total {1}.
Looks up a localized string similar to Cannot cast column holding {0} values to type {1}.
Looks up a localized string similar to Cannot cast elements of column '{0}' type of {1} to type {2} used as TKey in grouping .
Looks up a localized string similar to Line {0} cannot be parsed with the current Delimiters..
Looks up a localized string similar to Line {0} cannot be parsed with the current FieldWidths..
Looks up a localized string similar to Cannot resize down.
Looks up a localized string similar to Comment token cannot contain whitespace.
Looks up a localized string similar to DataType.
Looks up a localized string similar to Delimiter cannot be new line characters.
Looks up a localized string similar to Length (excluding null values).
Looks up a localized string similar to DataFrame already contains a column called {0}.
Looks up a localized string similar to Delimiters is empty..
Looks up a localized string similar to FieldWidths is empty..
Looks up a localized string similar to Empty file.
Looks up a localized string similar to Exceeded maximum buffer size..
Looks up a localized string similar to Parameter.Count exceeds the number of columns({0}) in the DataFrame .
Looks up a localized string similar to Parameter.Count exceeds the number of rows({0}) in the DataFrame .
Looks up a localized string similar to Expected either {0} or {1} to be provided.
Looks up a localized string similar to {0} not found..
Looks up a localized string similar to A double quote is not a legal delimiter when HasFieldsEnclosedInQuotes is set to True..
Looks up a localized string similar to Column is immutable.
Looks up a localized string similar to Inconsistent null bitmap and data buffer lengths.
Looks up a localized string similar to Inconsistent null bitmaps and NullCounts.
Looks up a localized string similar to Index cannot be greater than the Column's Length.
Looks up a localized string similar to Column '{0}' does not exist.
Looks up a localized string similar to All field widths, except the last element, must be greater than zero. A field width less than or equal to zero in the last element indicates the last field is of variable length..
Looks up a localized string similar to Line {0} has less columns than expected.
Looks up a localized string similar to Line {0} cannot be read because it exceeds the max line size..
Looks up a localized string similar to MapIndices exceeds column length.
Looks up a localized string similar to Array lengths are mistmached.
Looks up a localized string similar to Column lengths are mismatched.
Looks up a localized string similar to Expected column to hold values of type {0}.
Looks up a localized string similar to rowCount differs from Column length for Column .
Looks up a localized string similar to Expected value to be of type {0}.
Looks up a localized string similar to Expected value to be of type {0}, {1} or {2}.
Looks up a localized string similar to Expected a seekable stream.
Looks up a localized string similar to {0} is not a supported column type..
Looks up a localized string similar to Delimiters is null..
Looks up a localized string similar to FieldWidths is null..
Looks up a localized string similar to numeric column.
Looks up a localized string similar to {0} must be greater than 0.
Looks up a localized string similar to Cannot span multiple buffers.
Looks up a localized string similar to Stream doesn't support reading.
Looks up a localized string similar to Specified vector subtype {0} is not supported..
Returns a DataFrame from this IDataView.
The current IDataView.
The max number of rows in the DataFrame. Defaults to 100. Use -1 to construct a DataFrame using all the rows in the IDataView.
A DataFrame with at most the requested number of rows.
Returns a DataFrame with the first 100 rows of this IDataView.
The current IDataView.
The columns selected for the resultant DataFrame
A DataFrame with the selected columns and 100 rows.
Returns a DataFrame with the first rows of this IDataView, up to the given maximum.
The current IDataView.
The max number of rows in the DataFrame. Use -1 to construct a DataFrame using all the rows in the IDataView.
The columns selected for the resultant DataFrame
A DataFrame with the selected columns and rows.