
Computing total storage size of a folder in Azure Data Lake Storage

A few days ago we needed to calculate how much data each project had ingested into our data lake. That's when I realized there is no direct way to get the size of a directory in Azure Data Lake Storage. Storage Explorer can show folder statistics, including the size, but imagine doing that for 100 folders. So I decided to write a script.

The following PowerShell script gives you the size of every folder directly under the given path.

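# Folder whose immediate children should be measured
$path="/infomart"
# Data Lake Store account name; both calls below must use the same account
$account="azueus2devadlsdatalake"
# Local report file derived from the path, e.g. infomart.txt ("$path.txt" unquoted is parsed as a property lookup)
$outFile=($path.Trim('/') -replace '/','_') + ".txt"
$ChildPaths=(Get-AzureRmDataLakeStoreChildItem -Account $account -Path $path).Name
foreach($ChildPath in $ChildPaths){
	# Length is the total size of everything under the child path
	$length=(Get-AzureRmDataLakeStoreChildItemSummary -Account $account -Path "$path/$ChildPath" -Concurrency 128).Length
	"$path/$ChildPath, $length" | Out-File $outFile -Append
}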
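The script records each folder's Length as a raw number, which as far as I can tell is a byte count. If you would rather read the report in KB, MB, GB or TB, a small helper along these lines could do the conversion. This is only a sketch: Format-Size is a hypothetical name of my own and not part of any Azure module, and it relies on PowerShell's built-in 1KB/1MB/1GB/1TB numeric suffixes.

```powershell
# Hypothetical helper (not part of AzureRM): turn a byte count into a readable size.
function Format-Size {
    param([long]$Bytes)
    # Check the largest unit first; 1TB, 1GB, 1MB and 1KB are built-in PowerShell constants.
    if ($Bytes -ge 1TB) { return "{0:N2} TB" -f ($Bytes / 1TB) }
    if ($Bytes -ge 1GB) { return "{0:N2} GB" -f ($Bytes / 1GB) }
    if ($Bytes -ge 1MB) { return "{0:N2} MB" -f ($Bytes / 1MB) }
    if ($Bytes -ge 1KB) { return "{0:N2} KB" -f ($Bytes / 1KB) }
    # Anything below 1 KB is reported in plain bytes.
    return "$Bytes B"
}
```

To use it, wrap the value when writing the report line, for example "$(Format-Size $length)" in place of "$length". The decimal separator in the output follows your session's culture settings.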

You will need to install the AzureRM.DataLakeStore module and be signed in to your Azure account to run the above script.

Install-Module -Name AzureRM.DataLakeStore
