BIG DATA VISUALIZATION METHODS - Part II
- Circular Network Diagram.
In this method, data objects are placed around a circle and linked by curves according to how strongly they are related. The strength of each relation is encoded by line width or color saturation.
The method usually offers interactions that hide unnecessary links and highlight the selected ones. It therefore emphasizes direct relations between multiple objects and shows how strong each relation is.
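The placement and link encoding described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the `min_strength` threshold used to hide weak links, and the sample city data are assumptions, not part of the original method description.

```python
import math

def circular_layout(names, radius=1.0):
    """Place each object at an equal angular step around a circle."""
    step = 2 * math.pi / len(names)
    return {
        name: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, name in enumerate(names)
    }

def link_widths(links, max_width=8.0, min_strength=0.0):
    """Map relation strength to line width; links below min_strength are hidden."""
    visible = {pair: s for pair, s in links.items() if s >= min_strength}
    strongest = max(visible.values(), default=0.0)
    if strongest <= 0:
        # Nothing meaningful to scale against: draw every visible link at zero width.
        return {pair: 0.0 for pair in visible}
    return {pair: max_width * s / strongest for pair, s in visible.items()}

# Hypothetical data: objects and the strength of the relation between them.
cities = ["A", "B", "C", "D"]
flows = {("A", "B"): 40, ("A", "C"): 10, ("B", "D"): 20, ("C", "D"): 2}

positions = circular_layout(cities)
widths = link_widths(flows, min_strength=5)  # the weak C-D link is hidden
```

A renderer would then draw a curve between `positions[a]` and `positions[b]` for every pair left in `widths`, using the stored value as stroke width (or, equally, as color saturation).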
Typical use cases are diagrams of product transfers between cities and of relations between products bought in different shops. The method can also represent aggregated data as a set of arcs between the analyzed data objects, so analysts can obtain quantitative information about the relations between objects. The approach can be applied to large data volumes by placing data objects around the circle and varying the arc area of each object. Additional information, derived from other factors of the data objects, can be shown next to an arc. Importantly, a diagram is not limited to a single factor: different factors of the objects can be placed on the same diagram and relations built between them. In some cases this gives an analyst enough information to change the direction of the research or to make a final decision, although the diagram can still be difficult to perceive and understand. The method also satisfies the data variety criterion.
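One way to turn aggregated quantities into arcs is to give each object an angular span proportional to the total quantity that touches it. The sketch below assumes that convention (other diagrams size arcs by outgoing flow only); the function name `arc_spans`, the `gap` parameter, and the shipment figures are hypothetical.

```python
import math

def arc_spans(flow, gap=0.0):
    """Return {object: (start_angle, end_angle)} in radians.

    flow maps (source, target) -> quantity. Each object's arc is proportional
    to the total quantity flowing in or out of it (an assumed convention).
    """
    totals = {}
    for (src, dst), qty in flow.items():
        totals[src] = totals.get(src, 0) + qty
        totals[dst] = totals.get(dst, 0) + qty
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("flow must contain a positive total quantity")
    available = 2 * math.pi - gap * len(totals)  # angle left after the gaps
    spans, angle = {}, 0.0
    for name, total in totals.items():
        width = available * total / grand_total
        spans[name] = (angle, angle + width)
        angle += width + gap
    return spans

# Hypothetical product transfers between cities.
shipments = {("Paris", "Lyon"): 30, ("Lyon", "Nice"): 10, ("Paris", "Nice"): 20}
spans = arc_spans(shipments)
```

With these numbers the totals are 50, 40, and 30, so the three arcs cover 50/120, 40/120, and 30/120 of the circle.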
The circular form encourages the eye to move along curved lines, rather than in the zigzag fashion typical of a square or rectangular figure.
However, because the whole data set is represented at once, every change in the data must be followed by a repaint of the diagram.
Method advantages:
(i) It allows a relative representation of data that is easy to perceive.
(ii) Within the circle, the resolution varies linearly, increasing with the radial position. This makes the center of the circle ideal for compactly displaying summary statistics or indicating points of interest.
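The linear growth of resolution with radial position follows from the arc-length relation: an angular slot of fixed width covers a length equal to the radius times the angle. A small sketch, with a hypothetical helper name, makes the proportionality explicit.

```python
import math

def slot_arc_length(radius, n_objects):
    """Length of the arc available to one of n equally spaced objects."""
    return 2 * math.pi * radius / n_objects

inner = slot_arc_length(radius=1.0, n_objects=100)
outer = slot_arc_length(radius=2.0, n_objects=100)  # twice the radius
```

Doubling the radius doubles the space per object, which is why detailed tracks sit near the rim and compact summaries fit in the center.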
Method disadvantages:
(i) The method may produce a representation that is hard to read and may require regrouping the data objects on the screen.
(ii) Objects with the smallest parameter weight can be suppressed by larger ones.
- Parallel Coordinates.
Parallel coordinates extend visual analysis to multiple data factors for different objects.
All data factors to be analyzed are placed along one axis, and the corresponding values of each data object, on a relative scale, are placed along the other. Each data object is represented by a series of connected line segments that cross the factor axes, showing its place in the context of other objects.
The method needs only a single line on screen to represent an individual data object, which lets it meet the first criterion: large data volumes.
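The relative scaling described above can be sketched as follows: each factor is rescaled to the range 0 to 1 using its own minimum and maximum, and each object becomes one polyline with a vertex per factor. The function name, the choice to draw constant factors at mid-height, and the sample records are assumptions made for illustration.

```python
def to_polylines(records, factors):
    """Scale each factor to [0, 1] and return one polyline per record.

    Each polyline is a list of (axis_index, relative_value) points.
    """
    lows = {f: min(r[f] for r in records) for f in factors}
    highs = {f: max(r[f] for r in records) for f in factors}
    polylines = []
    for r in records:
        points = []
        for x, f in enumerate(factors):
            span = highs[f] - lows[f]
            # A factor with a single distinct value is drawn at mid-height.
            y = (r[f] - lows[f]) / span if span else 0.5
            points.append((x, y))
        polylines.append(points)
    return polylines, lows, highs

# Hypothetical records with three factors each.
cars = [
    {"price": 10, "speed": 100, "weight": 3},
    {"price": 20, "speed": 150, "weight": 1},
    {"price": 15, "speed": 200, "weight": 2},
]
lines, lows, highs = to_polylines(cars, ["price", "speed", "weight"])
```

Because every factor is mapped to the same 0 to 1 range, factors with very different units can share one screen.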
One extension of standard 2D parallel coordinates is the multi-relational 3D parallel coordinates. Here, the axes are placed, equally separated, on a circle with a focus axis in the center.
A data item is again displayed as a series of line segments intersecting all axes. This axis configuration has the advantage that all pairwise relationships between the focus variable in the center and all outer variables can be investigated simultaneously.
Parallel coordinates can handle several factors for many objects on a single screen because the method is based on relative values, which requires calculating the minimum and maximum values of each factor. It therefore satisfies the data variety criterion.
There is no need to repaint the whole image as long as values change within the minimum and maximum of each factor. If a value exceeds these limits, however, the image must be repainted to keep the visualization accurate.
This approach can therefore be used to visualize dynamic data. A second way to represent data over time is to use three-dimensional extensions of the polar coordinates method.
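The repaint rule for changing values can be expressed as a simple range check. In this sketch, which assumes the per-factor ranges are kept as `(minimum, maximum)` pairs, an incoming record triggers a full repaint only when it pushes some factor outside its current range; the function and variable names are hypothetical.

```python
def update_ranges(ranges, record):
    """Return (new_ranges, needs_full_repaint) for one incoming record.

    ranges maps factor -> (minimum, maximum) used by the current drawing.
    """
    new_ranges = dict(ranges)
    needs_full_repaint = False
    for factor, value in record.items():
        low, high = ranges[factor]
        if value < low or value > high:
            # The relative scale of this axis changes, so every line must move.
            new_ranges[factor] = (min(low, value), max(high, value))
            needs_full_repaint = True
    return new_ranges, needs_full_repaint

current = {"price": (10, 20), "speed": (100, 200)}
same, repaint_a = update_ranges(current, {"price": 12, "speed": 150})
wider, repaint_b = update_ranges(current, {"price": 25, "speed": 150})
```

When the flag is false, only the line for the changed object needs to be redrawn; when it is true, all lines on the affected axis shift because the relative scale has changed.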
Method advantages:
(i) The order of factors does not influence overall perception of the diagram.
(ii) The method allows us to analyze both the whole set of data objects at once and individual data objects.
Method disadvantages:
(i) The method limits the number of factors that can be shown at once.
(ii) Visualizing dynamic data ends up changing the whole data representation.