Publications

You can also find my articles on my Google Scholar profile.

    2024

  1. Modi, A., Tikmany, R., Malik, T., Komondoor, R., Gehani, A. and D'Souza, D., "Kondo: Efficient Provenance-driven Data Debloating", 40th IEEE International Conference on Data Engineering (ICDE), 2024
    2023

  3. Malik, Tanu, "Reproducible eScience: The Data Containerization Challenge", IEEE eScience, 2023
  4. Nakamura, Y, Kanj, I and Malik, T, "Efficient Differencing of System-level Provenance Graphs", 32nd ACM International Conference on Information and Knowledge Management (CIKM), 2023
  5. Malik, T and Khan, S, "Towards Shareable and Reproducible Cloud Computing Experiments", IEEE CloudSummit, 2023
  6. Modi, A., Reyad, M, Gehani, A., and Malik, T, "Querying Container Provenance", WWW '23 Companion: Companion Proceedings of the ACM Web Conference, 2023
  7. Niddodi, C., Gehani, A., Malik, T., Mohan, S., and Rilee, M., "IOSPReD: I/O Specialized Packaging of Reduced Datasets and Data-Intensive Applications for Efficient Reproducibility", IEEE Access, 2023
    2022

  9. Nakamura, Y. Malik, T. Kanj, I. Gehani, A., "Provenance-based Workflow Diagnostics Using Program Specification", 29th IEEE International Conference on High Performance Computing, Data, and Analytics, pp. 21-31, 12, 2022
  10. Ahmad, R. Manne, N. Malik, T., "Reproducible Notebook Containers using Application Virtualization", 18th IEEE International Conference on eScience, pp. 1-10, 10, 2022
  11. Manne, N. N. Satpati, S. Malik, T. Bagchi, A. Gehani, A. Chaudhary, A. , "CHEX: Multiversion Replay with Ordered Checkpoints", Proceedings of the Very Large Databases (VLDB), vol. 15, pp. 1297-1310, 2, 2022
    2021

  13. That, D. T. Gharehdaghi, M. Rasin, A. Malik, T. , "LDI: Learned Distribution Index for Column Stores", 2021 IEEE International Conference on Big Data (Big Data), pp. 376-387, 12, 2021
  14. Plale, B. A. Malik, T. Pouchard, L. C. , "Reproducibility Practice in High-Performance Computing: Community Survey Results", Computing in Science & Engineering, vol. 23, pp. 55-60, 9, 2021
  15. That, D. H. T. Gharehdaghi, M. Rasin, A. Malik, T. , "On Lowering Merge Costs of an LSM Tree", Proceedings of the 33rd International Conference on Scientific and Statistical Database Management, 7, 2021
  16. Malik, T. , "Artifact Description/Artifact Evaluation: A Reproducibility Bane or a Boon", Proceedings of the 4th International Workshop on Practical Reproducible Evaluation of Computer Systems, pp. 1-1, 6, 2021
  17. Choi, YoungDon and Goodall, Jonathan and Ahmad, Raza and Malik, Tanu and Tarboton, David, "An Approach for Open and Reproducible Hydrological Modeling using Sciunit and HydroShare", EGU General Assembly Conference Abstracts, 4, 2021
    2020

  19. Essawy, B. T. Goodall, J. L. Voce, D. Morsy, M. M. Sadler, J. M. Choi, Y. D. Tarboton, D. G. Malik, T. , "A taxonomy for reproducible and replicable research in environmental modelling", Environmental Modelling & Software, vol. 134, pp. 104753, 12, 2020
  20. Wagner, J. Rasin, A. Malik, T. Grier, J. , "ODSA: Open Database Storage Access", Extending Database Technology (EDBT), 8, 2020
  21. Wagner, J. Rasin, A. Heart, K. Malik, T. Grier, J. , "DF-toolkit: interacting with low-level database storage", Proceedings of the VLDB Endowment, vol. 13, 8, 2020
  22. Niddodi, C. Gehani, A. Malik, T. Navas, J. A. Mohan, S. , "MiDas: Containerizing Data-Intensive Applications with I/O Specialization", Proceedings of the 3rd International Workshop on Practical Reproducible Evaluation of Computer Systems, pp. 21-25, 6, 2020
  23. Chuah, J. Deeds, M. Malik, T. Choi, Y. Goodall, J. L. , "Documenting computing environments for reproducible experiments", Parallel Computing: Technology Trends, pp. 756-765, 2020
  24. Ahmad, R. Nakamura, Y. Manne, N. N. Malik, T. , "PROV-CRT: Provenance Support for Container Runtimes", 12th International Workshop on Theory and Practice of Provenance (TaPP 2020), 2020
  25. Nakamura, Y. Ahmad, R. Malik, T. , "Content-defined Merkle Trees for Efficient Container Delivery", 28th IEEE International Conference on High Performance Computing, Data, & Analytics, 2020
  26. Nakamura, Y. Malik, T. Gehani, A. , "Efficient provenance alignment in reproduced executions", 12th International Workshop on Theory and Practice of Provenance (TaPP 2020), 2020
    2019

  28. Youngdahl, A. Ton-That, D. Malik, T. , "SciInc: A Container Runtime for Incremental Recomputation", 2019 15th International Conference on eScience (eScience), pp. 291-300, 9, 2019
  29. Missier, P. Malik, T. Cala, J. , "Report on the first international workshop on incremental re-computation: Provenance and beyond", ACM SIGMOD Record, vol. 47, pp. 35-38, 5, 2019
  30. That, D. H. T. Wagner, J. Rasin, A. Malik, T. , "PLI+: Efficient Clustering of Cloud Databases", Distributed and Parallel Databases, vol. 37, pp. 177-208, 3, 2019
    2018

  32. Sadler, J. Essawy, B. Goodall, J. Voce, D. Choi, Y. Morsy, M. Yuan, Z. Malik, T. , "Leveraging Scientific Cyberinfrastructures to Achieve Computational Hydrologic Model Reproducibility", AGU Fall Meeting Abstracts, vol. 2018, pp. C13J-1252, 12, 2018
  33. Rasin, A. Malik, T. Wagner, J. Kim, C. , "Where Provenance in Database Storage", International Provenance and Annotation Workshop, pp. 231-235, 7, 2018
  34. Essawy, B. T. Goodall, J. L. Zell, W. Voce, D. Morsy, M. M. Sadler, J. Yuan, Z. Malik, T. , "Integrating scientific cyberinfrastructures to improve reproducibility in computational hydrology: Example for HydroShare and GeoTrust", Environmental Modelling & Software, vol. 105, pp. 217-229, 7, 2018
  35. Pham, Q. Malik, T. That, D. H. T. Youngdahl, A. , "Improving Reproducibility of Distributed Computational Experiments", Proceedings of the First International Workshop on Practical Reproducible Evaluation of Computer Systems, pp. 1-6, 6, 2018
  36. Yuan, Z. That, D. H. T. Kothari, S. Fils, G. Malik, T. , "Utilizing provenance in reusable research objects", Informatics, vol. 5, pp. 14, 3, 2018
  37. Wagner, J. Rasin, A. Heart, K. Malik, T. Furst, J. Grier, J. , "Detecting database file tampering through page carving", 21st International Conference on Extending Database Technology, 3, 2018
  38. Malik, T. Rasin, A. Youngdahl, A. , "Using Provenance for Generating Automatic Citations", 10th USENIX Workshop on the Theory and Practice of Provenance (TaPP 2018), 2018
  39. Essawy, B. T. Goodall, J. L. Morsy, M. M. Zell, W. Sadler, J. Malik, T. Yuan, Z. Voce, D. , "Achieving Reproducible Computational Hydrologic Models by Integrating Scientific Cyberinfrastructures", 9th International Congress on Environmental Modelling and Software, 2018
    2017

  41. Malik, T. Tarboton, D. G. Goodall, J. L. Choi, E. Bhatt, A. Peckham, S. D. Foster, I. That, D. T. Essawy, B. Yuan, Z. Dash, P. Fils, G. Gan, T. Fadugba, O. I. Saxena, A. Valentic, T. A. , "GeoTrust Hub: A Platform For Sharing And Reproducing Geoscience Applications", AGU Fall Meeting Abstracts, vol. 2017, pp. IN43A-0068, 12, 2017
  42. Goodall, J. L. Castronova, A. M. Bandaragoda, C. Morsy, M. M. Sadler, J. M. Essawy, B. Tarboton, D. G. Malik, T. Nijssen, B. Clark, M. P. Liu, Y. Wang, S. , "Cyberinfrastructure to Support Collaborative and Reproducible Computational Hydrologic Modeling", AGU Fall Meeting Abstracts, vol. 2017, pp. H14H-05, 12, 2017
  43. That, D. H. T. Fils, G. Yuan, Z. Malik, T. , "Sciunits: Reusable Research Objects", 2017 IEEE 13th International Conference on e-Science (e-Science), pp. 374-383, 10, 2017
  44. Wagner, J. Rasin, A. That, D. H. T. Malik, T. , "PLI: Augmenting live databases with custom clustered indexes", Proceedings of the 29th International Conference on Scientific and Statistical Database Management, pp. 1-6, 6, 2017
  45. Wagner, J. Rasin, A. Malik, T. Heart, K. Jehle, H. Grier, J. , "Database forensic analysis with DBCarver", CIDR 2017, 8th Biennial Conference on Innovative Data Systems Research, 1, 2017
    2016

  47. Balasubramani, B. S. Shivaprabhu, V. R. Krishnamurthy, S. Cruz, I. F. Malik, T. , "Ontology-based urban data exploration", Proceedings of the 2nd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, pp. 1-8, 10, 2016
  48. Li, X. Xu, X. Malik, T. , "Interactive provenance summaries for reproducible science", 2016 IEEE 12th International Conference on e-Science (e-Science), pp. 355-360, 10, 2016
  49. Essawy, B. T. Goodall, J. L. Malik, T. Xu, H. Conway, M. Gil, Y. , "Challenges with Maintaining Legacy Software to Achieve Reproducible Computational Analyses: An Example for Hydrologic Modeling Data Processing Pipelines", iEMSs Conference, 2016
    2015

  51. Malik, T. Foster, I. Goodall, J. L. Peckham, S. D. Baker, J. B. Gurnis, M. , "Personalized, Shareable Geoscience Dataspaces For Simplifying Data Management and Improving Reproducibility", AGU Fall Meeting Abstracts, vol. 2015, pp. IN21E-01, 12, 2015
  52. Pham, Q. Thaler, S. Malik, T. Foster, I. Glavic, B. , "Sharing and reproducing database applications", Proceedings of the VLDB Endowment, vol. 8, pp. 1988-1991, 8, 2015
  53. Madduri, R. Rodriguez, A. Uram, T. Heitmann, K. Malik, T. Sehrish, S. Chard, R. Cholia, S. Paterno, M. Kowalkowski, J. Habib, S. , "PDACS: a portal for data analysis services for cosmological simulations", Computing in Science & Engineering, vol. 17, pp. 18-26, 7, 2015
  54. Meng, H. Kommineni, R. Pham, Q. Gardner, R. Malik, T. Thain, D. , "An invariant framework for conducting reproducible computational science", Journal of Computational Science, vol. 9, pp. 137-142, 7, 2015
  55. Pham, Q. Malik, T. , "GEN: a database interface generator for HPC programs", Proceedings of the 27th International Conference on Scientific and Statistical Database Management, pp. 1-5, 6, 2015
  56. Pham, Q. Malik, T. Glavic, B. Foster, I. , "LDV: Light-weight database virtualization", 2015 IEEE 31st International Conference on Data Engineering, pp. 1179-1190, 4, 2015
    2014

  58. Catlett, C. Malik, T. Goldstein, B. Giuffrida, J. Shao, Y. Panella, A. Eder, D. Zanten, E. v. Mitchum, R. Thaler, S. Foster, I. T. , "Plenario: An Open Data Discovery and Exploration Platform for Urban Science.", IEEE Data Eng. Bull., vol. 37, pp. 27-42, 12, 2014
  59. Malik, T. Chard, K. Tchoua, R. B. Foster, I. , "GeoDataspaces: Simplifying Data Management Tasks with Globus", AGU Fall Meeting Abstracts, vol. 2014, pp. IN34B-08, 12, 2014
  60. Pham, Q. Malik, T. Foster, I. , "Auditing and maintaining provenance in software packages", International Provenance and Annotation Workshop, pp. 97-109, 6, 2014
  61. Malik, T. Chard, K. Foster, I. , "Benchmarking cloud-based tagging services", 2014 IEEE 30th International Conference on Data Engineering Workshops, pp. 231-238, 3, 2014
  62. Malik, T. , "GeoBase: indexing NetCDF files for large-scale data analysis", Big data management, technologies, and applications, pp. 295-313, 2014
  63. Malik, T. Pham, Q. Foster, I. T. Leisch, F. Peng, R. , "SOLE: towards descriptive and interactive publications", Implementing reproducible research, 2014
    2013

  65. Whaling, R. Malik, T. Foster, I. , "Lens: a faceted browser for research networking platforms", 2013 IEEE 9th International Conference on e-Science, pp. 196-203, 10, 2013
  66. Zhao, D. Shou, C. Malik, T. Raicu, I. , "Distributed data provenance for large-scale data-intensive computing", 2013 IEEE International Conference on Cluster Computing (CLUSTER), pp. 1-8, 9, 2013
  67. Hereld, M. Malik, T. Vishwanath, V. , "Proactive Support for Large-Scale Data Exploration", 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, pp. 2025-2034, 5, 2013
  68. Shou, C. Zhao, D. Malik, T. Raicu, I. , "Towards a provenance-aware distributed filesystem", 5th Workshop on the Theory and Practice of Provenance (TaPP), 2013
  69. Pham, Q. Malik, T. Foster, I. , "Using provenance for repeatability", 5th USENIX Workshop on the Theory and Practice of Provenance (TaPP 13), 2013
  70. Malik, T. Gehani, A. Tariq, D. Zaffar, F. , "Sketching distributed data provenance", Data Provenance and Data Management in eScience, pp. 85-107, 2013
    2012

  72. Malik, T. Foster, I. , "Addressing data access needs of the long-tail distribution of geoscientists", 2012 IEEE International Geoscience and Remote Sensing Symposium, pp. 5348-5351, 7, 2012
  73. Pham, Q. Malik, T. Foster, I. Lauro, R. D. Montella, R. , "SOLE: linking research papers with science objects", International Provenance and Annotation Workshop, pp. 203-208, 6, 2012
  74. Foster, I. Katz, D. S. Malik, T. Fox, P. , "Wagging the long tail of earth science: Why we need an earth science data web, and how to build it", 2012
    2011

  76. Malik, T. Best, N. Elliott, J. Madduri, R. Foster, I. , "Improving the efficiency of subset queries on raster images", Proceedings of the ACM SIGSPATIAL Second International Workshop on High Performance and Distributed Geographic Information Systems, pp. 34-37, 11, 2011
  77. Gehani, A. Tariq, D. Baig, B. Malik, T. , "Policy-based integration of provenance metadata", 2011 IEEE International Symposium on Policies for Distributed Systems and Networks, pp. 149-152, 6, 2011
    2010

  79. Malik, T. Nistor, L. Gehani, A. , "Tracking and sketching distributed data provenance", 2010 IEEE Sixth International Conference on e-Science, pp. 190-197, 12, 2010
  80. Malik, T. Wang, X. Little, P. Chaudhary, A. Thakar, A. , "A Dynamic Data Middleware cache for Rapidly-growing Scientific Repositories", ACM/IFIP/USENIX International Conference on Distributed Systems Platforms and Open Distributed Processing, pp. 64-84, 11, 2010
  81. Wang, X. Perlman, E. Burns, R. Malik, T. Budavári, T. Meneveau, C. Szalay, A. , "JAWS: Job-aware workload scheduling for the exploration of turbulence simulations", SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-11, 11, 2010
  82. Venkatasubramanian, V. Malik, T. Giridhar, A. Villez, K. Prasad, R. Shukla, A. Rieger, C. Daum, K. McQueen, M. , "RNEDE: Resilient network design environment", 2010 3rd International Symposium on Resilient Control Systems, pp. 72-75, 8, 2010
  83. Gehani, A. Kim, M. Malik, T. , "Efficient querying of distributed provenance stores", Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 613-621, 6, 2010
  84. Malik, T. Prasad, R. Patil, S. Chaudhary, A. Venkatasubramanian, V. , "Providing scalable data services in ubiquitous networks", International Conference on Database Systems for Advanced Applications, pp. 445-457, 4, 2010
    2009

  86. Wang, X. Burns, R. Malik, T. , "Liferaft: Data-driven, batch processing for the exploration of scientific databases", Conference on Innovative Database Research (CIDR), 9, 2009
  87. Malik, T. Wang, X. Dash, D. Chaudhary, A. Ailamaki, A. Burns, R. , "Adaptive physical design for curated archives", International Conference on Scientific and Statistical Database Management, pp. 148-166, 6, 2009
    2008

  89. Krishnamurthy, B. Malik, T. Stamatis, S. Venkatasubramanian, V. Caruthers, J. , "Rule-based classification systems for informatics", 2008 IEEE Fourth International Conference on eScience, pp. 420-421, 12, 2008
  90. Malik, T. Burns, R. , "Workload-Aware histograms for remote applications", International Conference on Data Warehousing and Knowledge Discovery, pp. 402-412, 9, 2008
  91. Malik, T. Wang, X. Burns, R. Dash, D. Ailamaki, A. , "Automated physical design in database caches", 2008 IEEE 24th International Conference on Data Engineering Workshop, pp. 27-34, 4, 2008
  92. Malik, T. , "Large scale data management for the sciences", 2008
    2007

  94. Wang, X. Malik, T. Burns, R. Papadomanolakis, S. Ailamaki, A. , "A workload-driven unit of cache replacement for mid-tier database caching", International Conference on Database Systems for Advanced Applications, pp. 374-385, 4, 2007
  95. Malik, T. Burns, R. C. Chawla, N. V. , "A Black-Box Approach to Query Cardinality Estimation.", CIDR, pp. 56-67, 1, 2007
    2006

  97. Malik, T. Burns, R. Chawla, N. V. Szalay, A. , "Estimating query result sizes for proxy caching in scientific database federations", SC'06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, pp. 36-36, 11, 2006
    2005

  99. Malik, T. Burns, R. Chaudhary, A. , "Bypass caching: Making scientific databases good network citizens", 21st International Conference on Data Engineering (ICDE'05), pp. 94-105, 4, 2005
  100. Batsakis, A. Malik, T. Terzis, A. , "Practical passive lossy link inference", International Workshop on Passive and Active Network Measurement, pp. 362-367, 3, 2005
    2002

  102. Szalay, A. S. Budavári, T. Malik, T. Gray, J. Thakar, A. R. , "Web services for the virtual observatory", Virtual Observatories, vol. 4846, pp. 124-132, 12, 2002
  103. Malik, T. Szalay, A. S. Budavári, T. Thakar, A. R. , "SkyQuery: A WebService approach to federate databases", arXiv preprint cs/0211023, 11, 2002
  104. Szalay, A. S. Gray, J. Thakar, A. R. Kunszt, P. Z. Malik, T. Raddick, J. Stoughton, C. , "The SDSS SkyServer - Public Access to the Sloan Digital Sky Server Data", ACM Special Interest Group on Management of Data (SIGMOD), pp. 570-581, 8, 2002