Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system

Carpen-Amarie, Alexandra; Costan, Alexandru; Cai, Jing; Antoniu, Gabriel; Bougé, Luc

doi:10.2478/v10006-011-0017-y

Article - details

Journal

International Journal of Applied Mathematics and Computer Science

2011 | 21 | 2 | 229-242

Article title

Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system

Authors

Alexandra Carpen-Amarie, Alexandru Costan, Jing Cai, Gabriel Antoniu, Luc Bougé

Content

Full texts:

http://matwbn.icm.edu.pl/ksiazki/amc/amc21/amc2122.pdf [remote]

Publication languages

EN

Abstracts

EN

Introspection is the prerequisite of autonomic behavior, the first step towards performance improvement and resource usage optimization for large-scale distributed systems. In grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More precisely, in the context of data-intensive applications, a specific introspection layer is required to collect data about the usage of storage resources, data access patterns, etc. This paper discusses the requirements for an introspection layer in a data management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. We illustrate the autonomic behavior of BlobSeer with a self-configuration component aiming to provide storage elasticity by dynamically scaling the number of data providers. Then we propose a preliminary approach for enabling self-protection for the BlobSeer system, through a malicious client detection component. The introspective architecture has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and behavior of the system.

Keywords

EN

distributed system, storage management, large-scale system, monitoring, introspection

Publisher

University of Zielona Góra Press

Journal

International Journal of Applied Mathematics and Computer Science

Year

2011

Volume

21

Issue

2

Pages

229-242

Physical description

Dates

published

2011

received

2010-07-01

revised

2010-12-06

revised

2011-01-21

Contributors

author

Alexandra Carpen-Amarie

INRIA Rennes-Bretagne Atlantique/IRISA, Campus Universitaire de Beaulieu, 35042 Rennes, France

author

Alexandru Costan

Polytechnic University of Bucharest, Department of Computer Science, 313 Spl. Independentei, 060042 Bucharest, Romania

author

Jing Cai

Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China

author

Gabriel Antoniu

INRIA Rennes-Bretagne Atlantique/IRISA, Campus Universitaire de Beaulieu, 35042 Rennes, France

author

Luc Bougé

Ecole Normale Supérieure de Cachan, Antenne de Bretagne/IRISA, Campus Universitaire de Beaulieu, 35042 Rennes, France

References

Albrecht, J., Oppenheimer, D., Vahdat, A. and Patterson, D.A. (2005). Design and implementation tradeoffs for wide-area resource discovery, Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing, Research Triangle Park, NC, USA, pp. 113-124.
ALICE (2010). The MonALISA Repository for ALICE, http://pcalimonitor.cern.ch/map.jsp.
Andreozzi, S., De Bortoli, N., Fantinel, S., Ghiselli, A., Rubini, G.L., Tortone, G. and Vistoli, M. C. (2005). GridICE: A monitoring service for grid systems, Future Generation Computer Systems 21(4): 559-571.
Bolze, R., Cappello, F., Caron, E., Daydé, M.J., Desprez, F., Jeannot, E., Jégou, Y., Lanteri, S., Leduc, J., Melab, N., Mornet, G., Namyst, R., Primet, P., Quétier, B., Richard, O., Talbi, E. and Touche, I. (2006). Grid'5000: A large scale and highly reconfigurable experimental grid testbed, International Journal of High Performance Computing Applications 20(4): 481-494.
Cardosa, M. and Chandra, A. (2008). Resource bundles: Using aggregation for statistical wide-area resource discovery and allocation, 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), Beijing, China, pp. 760-768.
Carpen-Amarie, A., Cai, J., Costan, A., Antoniu, G. and Bougé, L. (2010). Bringing introspection into the BlobSeer data-management system using the MonALISA distributed monitoring framework, 1st International Workshop on Autonomic Distributed Systems (ADiS 2010), Cracow, Poland, pp. 508-513.
Cooke, A., Gray, A., Nutt, W., Magowan, J., Oevers, M., Taylor, P., Cordenonsi, R., Byrom, R., Cornwall, L., Djaoui, A., Field, L., Fisher, S., Hicks, S., Leake, J., Middleton, R., Wilson, A., Zhu, X., Podhorszki, N., Coghlan, B., Kenny, S., O'Callaghan, D. and Ryan, J. (2004). The relational grid monitoring architecture: Mediating information about the grid, Journal of Grid Computing 2(4): 323-339.
Cowell, R.G., Dawid, A.P., Lauritzen, S.L. and Spiegelhalter, D.J. (1999). Probabilistic Networks and Expert Systems, Springer-Verlag, New York, NY.
Ding, J., Krämer, B.J., Bai, Y. and Chen, H. (2004). Probabilistic inference for network management, in M.M. Freire, P. Chemouil, P. Lorenz and A. Gravey (Eds.), Universal Multiservice Networks, Lecture Notes in Computer Science, Vol. 3262, Springer, Berlin/Heidelberg, pp. 498-507.
GGF (2010). The Global Grid Forum, http://www.ggf.org/.
Gunter, D., Tierney, B., Crowley, B., Holding, M. and Lee, J. (2000). Netlogger: A toolkit for distributed system performance analysis, MASCOTS '00: Proceedings of the 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, San Francisco, CA, USA, pp. 267-273.
Gurguis, S. and Zeid, A. (2005). Towards autonomic web services: Achieving self-healing using web services, DEAS05: Proceedings of Design and Evolution of Autonomic Application Software Conference, St. Louis, MO, USA, pp. 1-5.
Hood, C. and Ji, C. (1997). Automated proactive anomaly detection, Proceedings of the IEEE International Conference of Network Management (IM97), San Diego, CA, USA, pp. 688-699.
Jain, A., Chang, E.Y. and Wang, Y.-F. (2004). Adaptive stream resource management using Kalman filters, SIGMOD '04: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, Paris, France, pp. 11-22.
Jain, N., Kit, D., Mahajan, P., Yalagandula, P., Dahlin, M. and Zhang, Y. (2007). STAR: Self-tuning aggregation for scalable monitoring, VLDB '07: Proceedings of the 33rd International Conference on Very Large Data Bases, Vienna, Austria, pp. 962-973.
Kephart, J.O. and Chess, D.M. (2003). The vision of autonomic computing, Computer 36(1): 41-50.
Legrand, I., Newman, H., Voicu, R., Cirstoiu, C., Grigoras, C., Dobre, C., Muraru, A., Costan, A., Dediu, M. and Stratan, C. (2009). MonALISA: An agent based, dynamic service system to monitor, control and optimize distributed systems, Computer Physics Communications 180(12): 2472-2498.
Liang, J., Gu, X. and Nahrstedt, K. (2007). Self-configuring information management for large-scale service overlays, INFOCOM 2007: 26th IEEE International Conference on Computer Communications/Joint Conference of the IEEE Computer and Communications Societies, Anchorage, AK, USA, pp. 472-480.
Massie, M., Chun, B. and Culler, D. (2004). The Ganglia distributed monitoring system: Design, implementation, and experience, Parallel Computing 30(7): 817-840.
Nicolae, B., Antoniu, G. and Bougé, L. (2009). Enabling high data throughput in desktop grids through decentralized data and metadata management: The BlobSeer approach, Proceedings of the 15th International Euro-Par Conference, Delft, The Netherlands, pp. 404-416.
Nicolae, B., Antoniu, G., Bougé, L., Moise, D. and Carpen-Amarie, A. (2010). BlobSeer: Next generation data management for large scale infrastructures, Journal of Parallel and Distributed Computing 71(2): 168-184.
Parashar, M. and Hariri, S. (2005). Autonomic computing: An overview, in J.-P. Banâtre, P. Fradet, J.-L. Giavitto and O. Michel (Eds.), Unconventional Programming Paradigms, Lecture Notes in Computer Science, Vol. 3566, Springer, Berlin/Heidelberg, pp. 247-259.
Santos, Jr., E. and Young, J. D. (1999). Probabilistic temporal networks: A unified framework for reasoning with time and uncertainty, International Journal of Approximate Reasoning 20(3): 263-291.
Steinder, M. and Sethi, A. S. (2004). Probabilistic fault localization in communication systems using belief networks, IEEE/ACM Transactions on Networking 12(5): 809-822.
Tierney, B., Aydt, R. and Gunter, D. (2002). A grid monitoring architecture, Grid Working Draft GWD-PERF-16-3, http://www.gridforum.org/.
Van Renesse, R., Birman, K.P. and Vogels, W. (2003). Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining, ACM Transactions on Computer Systems 21(2): 164-206.
Vuran, M.C. and Akyildiz, I.F. (2006). Spatial correlation-based collaborative medium access control in wireless sensor networks, IEEE/ACM Transactions on Networking 14(2): 316-329.
Zanikolas, S. and Sakellariou, R. (2005). A taxonomy of grid monitoring systems, Future Generation Computer Systems 21(1): 163-188.

Identifiers

DOI

10.2478/v10006-011-0017-y

YADDA identifier

bwmeta1.element.bwnjournal-article-amcv21i2p229bwm
