groupfilter
Filter by group
Syntax
Description
Table Data
G = groupfilter(T,groupvars,method) returns the rows of table or timetable T that satisfy the group-wise filtering condition specified in method. The filtering condition method is a function handle applied to each nongrouping variable. Groups are defined by rows in the variables in groupvars that have the same unique combination of values. For example, G = groupfilter(T,"Trial",@(x) numel(x) > 5) groups the data in T by Trial, and keeps the rows that belong to groups with more than five trials.
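As a minimal sketch of the Trial example, assume a hypothetical table with a grouping variable Trial and one data variable Reading (both names and all values are invented for illustration):

```matlab
% Hypothetical data: eight readings spread over two trials.
Trial   = [1 1 1 1 1 1 2 2]';
Reading = [0.4 0.7 0.1 0.9 0.3 0.5 0.8 0.2]';
T = table(Trial,Reading);

% The method receives the Reading values of one group at a time.
% numel(x) > 5 is a logical scalar, so each group is kept or dropped whole.
G = groupfilter(T,"Trial",@(x) numel(x) > 5);

% Trial 1 has six rows and is kept; trial 2 has two rows and is dropped.
disp(G)
```

Because the condition depends only on the group size, the values in Reading do not affect which rows are returned.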
Array Data
B = groupfilter(A,groupvars,method) returns the rows of vector or matrix A that satisfy the group-wise filtering condition specified in method. The filtering condition method is a function handle applied to all column vectors. Groups are defined by rows in the column vectors in groupvars that have the same unique combination of values.
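The array form can be sketched as follows, assuming a hypothetical data vector A and grouping vector g of the same length:

```matlab
% Hypothetical data vector and grouping vector (same number of rows).
A = [10 20 30 40 50]';
g = [1 1 2 2 2]';

% Keep the rows that are above the mean of their own group.
% Group 1 has mean 15, group 2 has mean 40.
B = groupfilter(A,g,@(x) x > mean(x));

disp(B)   % rows 20 (group 1) and 50 (group 2) remain
```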
Examples
Filter Two-Variable Table
Create a table containing two variables.
groupID = [1 1 1 2 2 3]';
sample = [3 1 2 9 8 5]';
T = table(groupID,sample)

T=6×2 table
    groupID    sample
    _______    ______
       1         3
       1         1
       1         2
       2         9
       2         8
       3         5

Group by ID number, and return rows corresponding to groups with more than two samples.
Gnumel = groupfilter(T,"groupID",@(x) numel(x) > 2)

Gnumel=3×2 table
    groupID    sample
    _______    ______
       1         3
       1         1
       1         2

Return rows whose group samples are between 0 and 6.
Gvals = groupfilter(T,"groupID",@(x) min(x) > 0 && max(x) < 6)

Gvals=4×2 table
    groupID    sample
    _______    ______
       1         3
       1         1
       1         2
       3         5
Group by Largest Values
Create a table containing two variables that represent a day number and temperature.
daynum = [1 1 1 1 2 2 2 2]';
temp = [67 65 71 55 61 79 58 78]';
T = table(daynum,temp)

T=8×2 table
    daynum    temp
    ______    ____
      1        67
      1        65
      1        71
      1        55
      2        61
      2        79
      2        58
      2        78

Group by day number, and return the largest two temperatures for each day.
G = groupfilter(T,"daynum",@(x) ismember(x,maxk(x,2)))

G=4×2 table
    daynum    temp
    ______    ____
      1        67
      1        71
      2        79
      2        78
Group by Month
Create a table of dates and corresponding profits.
timeStamps = datetime([2017 3 4; 2017 3 2; 2017 3 15; 2017 4 10; ...
                       2017 4 14; 2017 4 30; 2017 5 25; ...
                       2017 5 29; 2017 5 21]);
profit = [2032 3071 1185 2587 1998 2899 3112 909 2619]';
T = table(timeStamps,profit)

T=9×2 table
    timeStamps     profit
    ___________    ______
    04-Mar-2017     2032
    02-Mar-2017     3071
    15-Mar-2017     1185
    10-Apr-2017     2587
    14-Apr-2017     1998
    30-Apr-2017     2899
    25-May-2017     3112
    29-May-2017      909
    21-May-2017     2619

Group the dates by month, and return rows that correspond to the maximum profit for that month.
Gmax = groupfilter(T,"timeStamps","month",@(x) x == max(x))

Gmax=3×3 table
    timeStamps     profit    month_timeStamps
    ___________    ______    ________________
    02-Mar-2017     3071         Mar-2017
    30-Apr-2017     2899         Apr-2017
    25-May-2017     3112         May-2017

Return rows whose month had an average profit greater than $2300.
Gavg = groupfilter(T,"timeStamps","month",@(x) mean(x) > 2300)

Gavg=3×3 table
    timeStamps     profit    month_timeStamps
    ___________    ______    ________________
    10-Apr-2017     2587         Apr-2017
    14-Apr-2017     1998         Apr-2017
    30-Apr-2017     2899         Apr-2017
Filter Three-Variable Table
Create a table T that contains information about nine individuals.
groupID = [1 2 3 1 2 3 1 2 3]';
Height = [62 61 59 66 70 72 57 67 71]';
HealthStatus = categorical(["Poor";"Good";"Fair";"Poor";"Fair";"Excellent";"Poor";"Excellent";"Fair"]);
T = table(groupID,Height,HealthStatus)

T=9×3 table
    groupID    Height    HealthStatus
    _______    ______    ____________
       1         62       Poor
       2         61       Good
       3         59       Fair
       1         66       Poor
       2         70       Fair
       3         72       Excellent
       1         57       Poor
       2         67       Excellent
       3         71       Fair

Group by ID number, and return rows for groups that contain only members with a minimum height of 60.
G1 = groupfilter(T,"groupID",@(x) min(x) >= 60,"Height")

G1=3×3 table
    groupID    Height    HealthStatus
    _______    ______    ____________
       2         61       Good
       2         70       Fair
       2         67       Excellent

Group by ID number, and return rows for groups that contain only members whose health status is Poor.
G2 = groupfilter(T,"groupID",@(x) all(x == "Poor"),"HealthStatus")

G2=3×3 table
    groupID    Height    HealthStatus
    _______    ______    ____________
       1         62       Poor
       1         66       Poor
       1         57       Poor
Filter with Vector Data
Create a vector of dates and a vector of corresponding profit values.
timeStamps = datetime([2017 3 4; 2017 3 2; 2017 3 15; 2017 3 10; ...
                       2017 3 14; 2017 3 31; 2017 3 25; ...
                       2017 3 29; 2017 3 21; 2017 3 18]);
profit = [2032 3071 1185 2587 1998 2899 3112 909 2619 3085]';

Group by day of the week, and compute the maximum profit for each group. Display the maximum profits and their corresponding groups.
[maxDailyProfit,dayOfWeek] = groupfilter(profit,timeStamps, ...
    "dayname",@(x) x == max(x))

maxDailyProfit = 5×1
        3071
        1185
        2899
        3112
        2619

dayOfWeek = 5x1 categorical
     Thursday
     Wednesday
     Friday
     Saturday
     Tuesday
Input Arguments
T
— Input table
table | timetable
Input table, specified as a table or timetable.
A
— Input array
column vector | matrix
Input array, specified as a column vector or a group of column vectors stored as a matrix.
groupvars
— Grouping variables or vectors
scalar | vector | matrix | cell array | function handle | table vartype subscript
Grouping variables or vectors, specified as one of these options:
- For array input data, groupvars can be either a column vector with the same number of rows as A or a group of column vectors arranged in a matrix or a cell array.
- For table or timetable input data, groupvars indicates which variables to use to compute groups in the data. You can specify the grouping variables with any of the options in this table.
Option | Description | Examples |
---|---|---|
Variable name | A character vector or string scalar specifying a single table variable name | 'Var1', "Var1" |
Vector of variable names | A cell array of character vectors or string array, where each element is a table variable name | {'Var1' 'Var2'}, ["Var1" "Var2"] |
Scalar or vector of variable indices | A scalar or vector of table variable indices | 1, [1 3 5] |
Logical vector | A logical vector whose elements each correspond to a table variable, where true includes the corresponding variable and false excludes it | [true false true] |
Function handle | A function handle that takes a table variable as input and returns a logical scalar | @isnumeric |
vartype subscript | A table subscript generated by the vartype function | vartype("numeric") |
Example: groupfilter(T,"Var3",method)
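The options in the table are interchangeable ways of naming the same grouping variable. The sketch below uses a hypothetical two-variable table (site and label are invented names) and selects site by name, by index, by logical vector, and by vartype subscript:

```matlab
% Hypothetical table: one numeric grouping variable, one string data variable.
site  = [1 1 2 2 2]';
label = ["a" "b" "c" "d" "e"]';
T = table(site,label);

keepBig = @(x) numel(x) > 2;   % keep groups with more than two rows

byName    = groupfilter(T,"site",keepBig);              % variable name
byIndex   = groupfilter(T,1,keepBig);                   % variable index
byLogical = groupfilter(T,[true false],keepBig);        % logical vector
byType    = groupfilter(T,vartype("numeric"),keepBig);  % vartype subscript

% All four calls are expected to select site and return the same rows.
disp(isequal(byName,byIndex,byLogical,byType))
```

Name-based selection is usually the most robust choice, because indices and logical vectors silently change meaning if the variable order of the table changes.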
method
— Filtering method
function handle
Filtering method, specified as a function handle. method defines the function used to filter out members from each group. The function must return a logical scalar or a logical column vector with the same number of rows as the input data indicating which group members to select.
- If the function returns a logical scalar, then either all members of the group are filtered out (when the value is false) or none are filtered out (when the value is true).
- If the function returns a logical vector, then members of groups are filtered out when the corresponding element is false, and members are kept when the corresponding element is true.
To define the function handle, use a syntax of the form @(inputargs) mymethod, where mymethod depends on inputargs.
- A function can filter for rows corresponding to groups that meet a condition. For example, @(x) mean(x) > 10 passes to the output only rows corresponding to groups with a group mean greater than 10.
- A function can filter for rows that meet a condition within their corresponding group. For example, @(x) x == max(x) passes to the output only rows corresponding to the maximum value of rows within their group.
For more information, see Create Function Handle and Anonymous Functions.
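The difference between the two kinds of method can be seen side by side in a small sketch. The vectors x and g are hypothetical:

```matlab
% Hypothetical data and grouping vectors.
g = [1 1 1 2 2 2]';
x = [4 8 12 20 30 40]';

% Group-level condition: mean(v) > 10 is a logical scalar per group,
% so each group is kept or dropped as a whole.
% Group 1 has mean 8 (dropped); group 2 has mean 30 (kept).
keepGroups = groupfilter(x,g,@(v) mean(v) > 10);

% Row-level condition: v == max(v) is a logical vector per group,
% so individual rows are kept or dropped within each group.
keepRows = groupfilter(x,g,@(v) v == max(v));

disp(keepGroups')   % 20 30 40
disp(keepRows')     % 12 40
```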
When groupfilter applies the method to more than one nongrouping variable at a time, the method returns a logical scalar or vector for each variable. For each row, the corresponding values in all returned scalars or vectors must be true to pass the row to the output.
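A short sketch of this all-variables-must-pass rule, using a hypothetical table with two nongrouping variables a and b:

```matlab
% Hypothetical table: id is the grouping variable, a and b are data variables.
id = [1 1 2 2]';
a  = [5 6 1 9]';
b  = [7 2 8 8]';
T  = table(id,a,b);

% The method runs on a and on b separately within each group.
%   a > 4 gives [1 1 0 1], b > 4 gives [1 0 1 1]
% A row is kept only where both results are true (rows 1 and 4).
G = groupfilter(T,"id",@(x) x > 4);

disp(G)
```

To filter on one variable only, restrict the method with the datavars argument.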
groupbins
— Binning scheme
"none" (default) | scalar | vector | cell array
Binning scheme, specified as one of these options:
- "none", indicating no binning
- A list of bin edges, specified as a numeric vector, or a datetime vector for datetime grouping variables or vectors
- A number of bins, specified as a positive integer scalar
- A time duration, specified as a scalar of type duration or calendarDuration, indicating bin widths (for datetime or duration grouping variables or vectors only)
- A cell array listing binning methods for each grouping variable or vector
- A time bin for datetime and duration grouping variables or vectors only, specified as one of these strings.
Value | Description | Data Type |
---|---|---|
"second" | Each bin is 1 second. | datetime and duration |
"minute" | Each bin is 1 minute. | datetime and duration |
"hour" | Each bin is 1 hour. | datetime and duration |
"day" | Each bin is 1 calendar day. This value accounts for daylight saving time shifts. | datetime and duration |
"week" | Each bin is 1 calendar week. | datetime only |
"month" | Each bin is 1 calendar month. | datetime only |
"quarter" | Each bin is 1 calendar quarter. | datetime only |
"year" | Each bin is 1 calendar year. This value accounts for leap days. | datetime and duration |
"decade" | Each bin is 1 decade (10 calendar years). | datetime only |
"century" | Each bin is 1 century (100 calendar years). | datetime only |
"secondofminute" | Bins are seconds from 0 to 59. | datetime only |
"minuteofhour" | Bins are minutes from 0 to 59. | datetime only |
"hourofday" | Bins are hours from 0 to 23. | datetime only |
"dayofweek" | Bins are days from 1 to 7. The first day of the week is Sunday. | datetime only |
"dayname" | Bins are full day names such as "Sunday". | datetime only |
"dayofmonth" | Bins are days from 1 to 31. | datetime only |
"dayofyear" | Bins are days from 1 to 366. | datetime only |
"weekofmonth" | Bins are weeks from 1 to 6. | datetime only |
"weekofyear" | Bins are weeks from 1 to 54. | datetime only |
"monthname" | Bins are full month names such as "January". | datetime only |
"monthofyear" | Bins are months from 1 to 12. | datetime only |
"quarterofyear" | Bins are quarters from 1 to 4. | datetime only |
When multiple grouping variables or vectors are specified, you can provide a single binning method that is applied to all grouping variables or vectors, or a cell array containing a binning method for each grouping variable or vector such as {"none",[0 2 4 Inf]}.
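As a sketch of binning by a list of edges, assume a hypothetical table with a numeric grouping variable age and a data variable score. The binning scheme goes between groupvars and method, as in the month example:

```matlab
% Hypothetical table of ages and scores.
age   = [21 24 33 38 45 52]';
score = [55 62 71 78 85 93]';
T = table(age,score);

% Bin age with edges [20 30 40 60], then keep the bins whose mean score
% is above 70.
%   ages 21, 24 -> mean score 58.5 (dropped)
%   ages 33, 38 -> mean score 74.5 (kept)
%   ages 45, 52 -> mean score 89   (kept)
G = groupfilter(T,"age",[20 30 40 60],@(x) mean(x) > 70);

disp(G.score')   % 71 78 85 93
```

As in the month example, the output table is expected to carry an additional variable that records the bin of each remaining row.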
datavars
— Table variables to operate on
scalar | vector | cell array | function handle | table vartype subscript
Table variables to operate on, specified as one of the options in this table. datavars indicates which variables of the input table or timetable to apply the filtering methods to. Other variables not specified by datavars pass through to the output without being operated on. groupfilter applies the filtering methods to the specified variables and uses the results to remove rows from all variables. When datavars is not specified, groupfilter operates on each nongrouping variable.
Option | Description | Examples |
---|---|---|
Variable name | A character vector or string scalar specifying a single table variable name | 'Var1', "Var1" |
Vector of variable names | A cell array of character vectors or string array, where each element is a table variable name | {'Var1' 'Var2'}, ["Var1" "Var2"] |
Scalar or vector of variable indices | A scalar or vector of table variable indices | 1, [1 3 5] |
Logical vector | A logical vector whose elements each correspond to a table variable, where true includes the corresponding variable and false excludes it | [true false true] |
Function handle | A function handle that takes a table variable as input and returns a logical scalar | @isnumeric |
vartype subscript | A table subscript generated by the vartype function | vartype("numeric") |
Example: groupfilter(T,groupvars,method,["Var1" "Var2" "Var4"])
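The effect of datavars is easiest to see by comparing a call with it to a call without it. The table below is hypothetical (team, wins, and losses are invented names):

```matlab
% Hypothetical table with one grouping variable and two data variables.
team   = [1 1 2 2]';
wins   = [3 4 9 8]';
losses = [7 6 1 2]';
T = table(team,wins,losses);

% Apply the method to wins only. Team 2 (mean wins 8.5) passes,
% and its losses values come along without being tested.
Gwins = groupfilter(T,"team",@(x) mean(x) > 5,"wins");

% Without datavars the method is applied to wins and losses, and a row
% must pass for both. No team has a mean above 5 in both variables.
Gall = groupfilter(T,"team",@(x) mean(x) > 5);

disp(Gwins)
disp(height(Gall))   % 0
```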
LR
— Included bin edge
"left" (default) | "right"
Included bin edge, specified as either "left" or "right", indicating which end of the bin interval is inclusive.
This argument can be specified only when groupbins is specified, and the value applies to all binning schemes for all grouping variables or vectors.
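The sketch below shows how the included edge can change group membership. It assumes that LR is passed as the value of a name-value argument called "IncludedEdge", as in related grouping functions. The text above does not state that name, so treat it as an assumption and confirm it against the syntax list for your release. The data is hypothetical.

```matlab
% Hypothetical data: two grouping values sit exactly on the edge 5.
x = [1 2 3 4]';
g = [1 5 5 9]';
edges = [0 5 10];
moreThanTwo = @(v) numel(v) > 2;

% Default ("left"): 5 belongs to the bin that starts at 5,
% so the bins hold {1} and {5, 5, 9}.
leftB = groupfilter(x,g,edges,moreThanTwo);

% "right": 5 belongs to the bin that ends at 5,
% so the bins hold {1, 5, 5} and {9}.
% "IncludedEdge" is the assumed argument name.
rightB = groupfilter(x,g,edges,moreThanTwo,"IncludedEdge","right");

disp(leftB')    % 2 3 4
disp(rightB')   % 1 2 3
```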
Output Arguments
G
— Output table
table | timetable
Output table for table or timetable input data, returned as a table or timetable. G contains the rows in T that satisfy the group-wise filtering method.
B
— Output array
vector | matrix
Output array for array input data, returned as a vector or matrix. B contains the rows in A that satisfy the group-wise filtering method.
BG
— Grouping vectors
column vector | cell array of column vectors
Grouping vectors for array input data, returned as a column vector or cell array of column vectors. BG contains the unique grouping vector or binned grouping vector combinations that correspond to the rows in B.
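A minimal sketch of requesting both outputs for array input, with hypothetical vectors vals and grp:

```matlab
% Hypothetical data and grouping vectors.
vals = [5 9 2 7 4]';
grp  = [1 1 2 2 2]';

% Keep the largest value of each group and also return the group
% that each remaining row belongs to.
[B,BG] = groupfilter(vals,grp,@(x) x == max(x));

disp([B BG])   % 9 belongs to group 1, 7 belongs to group 2
```

The second output is useful with array input because, unlike a table, the filtered array B does not carry its grouping values along with it.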
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
Usage notes and limitations:
- If A and groupvars are both tall matrices, then they must have the same number of rows.
- If the first input is a tall matrix, then groupvars can be a cell array containing tall grouping vectors.
- The groupvars and datavars arguments do not support function handles.
- The method argument must be a valid input for splitapply operating on a tall array.
- When grouping by discretized datetime arrays, the categorical group names are different compared to in-memory groupfilter calculations.
For more information, see Tall Arrays.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
Sparse inputs are not supported.
Binning scheme is not supported for datetime or duration data.
Input tables that contain multidimensional arrays are not supported.
Computation methods must be constant.
Grouping variables must be constant when the first input argument is a table.
Data variables must be constant.
Binning scheme specified as character vectors or strings must be constant.
Name-value arguments must be constant.
Computation methods cannot return sparse or multidimensional results.
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
Version History
Introduced in R2019b
R2022b: Code generation support
Generate C or C++ code for the groupfilter function. For usage notes and limitations, see C/C++ Code Generation.
R2022a: Improved performance with small group size
The groupfilter function shows improved performance, especially when the data count in each group is small.
For example, this code filters by group a matrix with 500 groups with a count of 10 each. The code is about 2.32x faster than in the previous release.
function timingGroupfilter
data = (1:5000)';
groups = repelem(1:length(data)/10,10)';
p = randperm(length(data));
data = data(p);
groups = groups(p);
tic
for k = 1:600
    G = groupfilter(data,groups,@(x) x == max(x));
end
toc
end
The approximate execution times are:
R2021b: 2.32 s
R2022a: 1.00 s
The code was timed on a Windows® 10, Intel® Xeon® CPU E5-1650 v4 @ 3.60 GHz test system by calling the timingGroupfilter function.