主要内容

高阵列的可视化

可视化大数据集需要以某种方式汇总,归纳或采样数据,以减少屏幕上绘制的点数。在某些情况下,诸如直方图馅饼bin the data to reduce the size, while other functions such asplot分散use a more complex approach that avoids plotting duplicate pixels on the screen. For problems where the pixel overlap is relevant to the analysis, thebinscatter功能还提供了一种可视化密度模式的有效方法。

Visualizing tall arrays doesnot需要使用gather。MATLAB®immediately evaluates and displays visualizations of tall arrays. Currently, you can visualize tall arrays using the functions and methods in this table.

功能 Required Toolboxes Notes
plot

These functions plot in iterations, progressively adding to the plot as more data is read. During the updates, a progress indicator shows the proportion of data that has been plotted. Zooming and panning is supported during the updating process, before the plot is complete. To stop the update process, press the pause button in the progress indicator.

分散
binscatter
直方图
直方图2
馅饼

仅用于可视化分类数据。

BINSCATTERPLOT(Statistics and Machine Learning Toolbox) 统计和机器学习工具箱™

Figure contains a slider to control the brightness and color detail in the image. The slider adjusts the value of the伽玛图像校正参数。

ksdensity(Statistics and Machine Learning Toolbox) 统计和机器学习工具箱

Produces a probability density estimate for the data, evaluated at 100 points for univariate data, or 900 points for bivariate data.

datasample(Statistics and Machine Learning Toolbox) 统计和机器学习工具箱

datasampleenables you to extract a subsample of a tall array in a statistically sound way compared to simple indexing. If the subset of data is small enough to fit in memory, then you can use plotting and fitting functions on the subset that do not directly support tall arrays.

Tall Array Plotting Examples

此示例显示了几种可视化高阵列的不同方式。

AirlinesMall.CSVdata set, which contains rows of airline flight data. Select a subset of the table variables to work with and remove rows that contain missing values.

ds = tabularTextDatastore('airlinesmall.csv','TreatAsMissing','na');ds.selectedVariablenames = {“年”,'Month','arrdelay','DepDelay','起源','dest'}; T = tall(ds); T = rmmissing(T)
T = Mx6 tall table Year Month ArrDelay DepDelay Origin Dest ____ _____ ________ ________ _______ _______ 1987 10 8 12 {'LAX'} {'SJC'} 1987 10 8 1 {'SJC'} {'BUR'} 1987 10 21 20 {'SAN'} {'SMF'} 1987 10 13 12 {'BUR'} {'SJC'} 1987 10 4 -1 {'SMF'} {'LAX'} 1987 10 59 63 {'LAX'} {'SJC'} 1987 10 3 -2 {'SAN'} {'SFO'} 1987 10 11 -1 {'SEA'} {'LAX'} : : : : : : : : : : : :

按月航班的饼图

Convert the numericMonthvariable into a categorical variable that reflects the name of the month. Then plot a pie chart showing how many flights are in the data for each month of the year.

T.Month = categorical(T.Month,1:12,{'扬','feb','Mar','apr','May','Jun','Jul','Aug','Sep','oct','Nov','dec'})
t = mx6高桌子年度ard ardelay depdelay起源____ ______________________________________________________________________________________________________________11878111871987198719871987198719871987198719878片{'1987年12月8日12 {'lax'} {'sjc'} 1987年10月8日1 {'sjc'} {'sjc'} {'bur'}'san'} {'smf'} 1987年10月13日12 {'bur'} {'sjc'} 1987 oct 4 -1 {'smf'} {'lax'} 1987 oct 59 63 {'lax'} {'sjc {'sjc {'sjc'} 1987年10月3日-2 {'san'} {'sfo'} 1987年10月11日-1 {'sea'} {'lax'} :: :: :: :: :: :: ::: :: :: :: ::: :: :: :: :: :: :: ::
馅饼(T.Month)
使用本地MATLAB会话评估高高的表达: - 通过2:of 2:在1.3秒完成 - 第2次,共2个:在3.1秒内完成的1秒评估完成

Histogram of Delays

Plot a histogram of the arrival delays for each flight in the data. Since the data has a long tail, limit the plotting area using theBinLimitsname-value pair.

直方图(T.ArrDelay,“二手限制”,[-50 150])
使用本地MATLAB会话评估高高的表达: - 通过2:of 2:在2.3秒内完成 - 第2秒:完成在3.9秒内完成的0.86秒评估

图包含一个轴对象。The axes object contains an object of type histogram.

scatter

Plot a scatter plot of arrival and departure delays. You can expect a strong correlation between these variables since flights that leave late are also likely to arrive late.

When operating on tall arrays, theplot,分散, 和binscatter功能绘制迭代中的数据,随着读取更多数据的读取,逐渐添加到图中。在更新过程中,图的顶部有一个进度指标,显示绘制了多少数据。在绘图完成之前,在更新过程中支持缩放和平宁。万博1manbetx

散布(T.Arrdelay,T.Depdelay)Xlabel(“到达延迟”)ylabel('Departure Delay')xlim([-140 1000])Ylim([-140 1000])

图包含一个轴对象。The axes object contains an object of type scatter.

The progress bar also includes aPause/Resume按钮。显示足够的数据后,请使用按钮尽早停止图。

适合趋势线

Use thepolyfit多腔functions to overlay a linear trend line on the plot of arrival and departure delays.

holdp = polyFit(T.arrdelay,T.Depdelay,1);x = stort(t.arrdelay,1);yp = polyval(p,x);情节(x,yp,'r-') 抓住离开

图包含一个轴对象。The axes object contains 2 objects of type scatter, line.

Visualize Density

点的散点图有助于到一定点,但是如果点广泛重叠,则很难从图中解密信息。在这种情况下,它有助于可视化图中点的点密度以发现趋势。

Use thebinscatter在到达和出发延迟图中可视化点密度的功能。

binscatter(T.ArrDelay,T.DepDelay,'XLimits',[-100 1000],'YLimits',[ -  100 1000])xlim([ -  100 1000])ylim([ -  100 1000])xlabel(“到达延迟”)ylabel('Departure Delay')

图包含一个轴对象。轴对象包含类型binscatter的对象。

调整攀登轴的属性使所有大于150的bin值都相同。这样可以防止几个具有非常大值的垃圾箱主导地块。

ax = gca;ax.clim = [0 150];

图包含一个轴对象。轴对象包含类型binscatter的对象。

See Also

||

相关话题