Problem: business requires an report in excel format to be automated in web application.
Final web format:
Original data format:
Key points:
- Data must be summarized by date, week, month, year, and job name dimensions.
- Fact table contains dozen millions of records for several years;
- The reporting requires to rotate the data by 90 degrees to display summarized data
- Multiple data presentations in a single column, witch includes integer number, decimal, and percentage.
Solution:
- Get data from data source
OpenQuery
for linked server does not accept variables, therefore we have to use dynamic
query statement.set @openquery = ' SELECT [Date], [Day], JobName, Volume_to_Work, Volume_Worked, Volume_RPC FROM OPENQUERY([S1],'' SELECT cast(a.Date as Date) [Date], datename(dw,date) as [Day], a.JobName, a.Volume_to_Work, a.Volume_Worked, isnull(b.Volume_RPC,0) as Volume_RPC FROM ( –VOLUME select BACKUPDATE date,JobName,sum(VOLUME_SENT) as Volume_to_Work,sum(VOLUME_WORKED) as Volume_Worked from TBL_METRICS_RAPPORTVOLUME where JobName in ('''''+ replace( case @JOB_NAME_LIST when 'All|' then 'Ucm_Arics_ON_Preview|Ucm_Caprice_QC_Preview|UCM_OB_E_PREVIEW|UCM_OB_F_PREVIEW' else @JOB_NAME_LIST end ,'|',''''',''''') +''''' ) and year(BACKUPDATE) = '+cast(@REPORT_YEAR as varchar)+' group by JobName,BACKUPDATE ) a left join ( –RPC select Backupdate,session,count(session) as Volume_RPC from TBL_ACCOUNTS_WORKED a left join ( select RPC,Result from TBL_REFERENCE_DIALERCODES where site = ''''Internal'''' ) b on a.RELEASECODE= b.RESULT where session in ('''''+ replace( case @JOB_NAME_LIST when 'All|' then 'Ucm_Arics_ON_Preview|Ucm_Caprice_QC_Preview|UCM_OB_E_PREVIEW|UCM_OB_F_PREVIEW' else @JOB_NAME_LIST end ,'|',''''',''''') +''''' ) and year(Backupdate) = '+cast(@REPORT_YEAR as varchar)+' and b.RPC = 1 group by Backupdate,session ) b on a.Date = b.Backupdate and a.JobName = b.session where datename(dw,date) <> ''''Sunday'''''' )' create table #campaign ([Date] date,[Day] varchar(20),JobName varchar(255),Volume_to_work int,Volume_worked int,Volume_RPC int) insert #campaign exec(@openquery) generated openquery look like: SELECT [Date], [Day], JobName, Volume_to_Work, Volume_Worked, Volume_RPC FROM OPENQUERY([BREPP1],' SELECT cast(a.Date as Date) [Date], datename(dw,date) as [Day], a.JobName, a.Volume_to_Work, a.Volume_Worked, isnull(b.Volume_RPC,0) as Volume_RPC FROM ( –VOLUME select BACKUPDATE date,JobName,sum(VOLUME_SENT) as Volume_to_Work,sum(VOLUME_WORKED) as Volume_Worked from [Source_Dialer].[dbo].[TBL_METRICS_RAPPORTVOLUME] where JobName in (''All'',''Ucm_Arics_ON_Preview'',''Ucm_Caprice_QC_Preview'',''UCM_OB_E_PREVIEW'',''UCM_OB_F_PREVIEW'' ) and year(BACKUPDATE) = 2017 group by JobName,BACKUPDATE ) a left join ( –RPC select Backupdate,session,count(session) as Volume_RPC from Source_Dialer.dbo.TBL_METRICS_ACCOUNTS_WORKED a left join ( select RPC,Result from Source_Dialer.dbo.TBL_METRICS_REFERENCE_DIALERCODES where site = ''Internal'') b on cast(a.RELEASECODE as varchar)= b.RESULT where session in (''All'',''Ucm_Arics_ON_Preview'',''Ucm_Caprice_QC_Preview'',''UCM_OB_E_PREVIEW'',''UCM_OB_F_PREVIEW'' ) and year(Backupdate) = 2017 and b.RPC = 1 group by Backupdate,session ) b on a.Date = b.Backupdate and a.JobName = b.session where datename(dw,date) <> ''Sunday''' )
- Get summarized data
select row_number() over(order by d.[JobName],d.date) as Id ,d.[JobName] ,year(d.[Date]) [Year] ,right(cast(d.DateISO as varchar),6) + cast(d.DayOfWeek as varchar) [Weekday] ,d.[Month] ,datepart(wk,d.[Date]) Week ,d.[Date] ,sum(Volume_to_work) Inventory ,sum(Volume_worked) Worked ,sum(Volume_RPC) RPC into #view_campaign from #campaign c right join ( select [JobName],cd.[Date],cd.DateISO,cd.DayOfWeek,cd.month from (select distinct [JobName] from #campaign) c full outer join dbo.tbl_dim_date cd on 1=1 where cd.year=@PRM_REPORT_YEAR ) d on d.[JobName]=c.[JobName] and c.[Date]=d.[Date] group by d.[JobName] ,year(d.[Date]) ,d.[Month] ,datepart(Wk,d.[Date]) ,d.[Date] ,right(cast(d.DateISO as varchar),6) + cast(d.DayOfWeek as varchar)
- Summarized by week, month, year for each jobname, use rollup to get weekly, monthly total columns in final report
- Remove duplicated week that cross months
–Sum the value of the week that cross months (Line38 + Line 44) update #tmp set Inventory=t.Inventory, Worked=t.Worked, rpc=t.rpc from #tmp v join ( select JobName,Agg ,sum([Inventory])[Inventory] ,sum([Worked])[Worked] ,sum([RPC])[RPC] from #tmp where Agg like 'w%' group by Agg,JobName ) t on t.Agg=v.[Agg] where v.agg=t.agg and v.JobName=t.JobName
Delete (Line38) the value of the first week of the duplicated week delete from #tmp where agg+cast([Month] as varchar) in (select agg+cast(min([Month]) as varchar) from #tmp where agg like 'w%' group by JobName,agg having count(agg)>1 )
)
- Add calculated columns: CompleteRate, ReachRate
They are the result of division, so dot not use avg to get wrong resultALTER TABLE #tmp add CompleteRate decimal(14,4), ReachRate decimal(14,4) update #tmp set CompleteRate = case [Inventory] when 0 then 0 else [Worked]*1.0/[Inventory] end, ReachRate = case [Inventory] when 0 then 0 else [RPC]*1.0/[Inventory] end select * from #tmp
- Generate dynamic columns definition and header columns
–@colDesc is for table header, # followed by datatype is for display formatting select @col += ',' + quotename(agg)+' varchar(15)' from #tmp where [year]=@REPORT_YEAR and JobName='All' select @colDesc += ',[' + dbo.[ufn_GetCampaignHeader](agg)+'#string] varchar(15)' from #tmp where [year]=@REPORT_YEAR and JobName='All' select @col = stuff(@col,1,1,''), @colDesc = stuff(@colDesc,1,1,'')
The value of the variable
@col = [YTD2017] varchar (15), [Month1] varchar (15), [2017-01-01] varchar (15), [2017-01-02] varchar (15), [2017-01-03] varchar (15), [2017-01-04] varchar (15), [2017-01-05] varchar (15), [2017-01-06] varchar (15), [2017-01-07] varchar (15), [Wk1] varchar (15), [2017-01-08] varchar (15), [2017-01-09] varchar (15), …… [2017-12-28] varchar (15), [2017-12-29] varchar (15), [2017-12-30] varchar (15), [Wk52] varchar (15), [2017-12-31] varchar (15), [Wk53] varchar (15)
The value of the variable
@colDesc = [YTD2017#string] varchar(15), [Jan#string] varchar(15), [Sun170101#string] varchar(15), [Mon170102#string] varchar(15), [Tue170103#string] varchar(15), [Wed170104#string] varchar(15), [Thu170105#string] varchar(15), [Fri170106#string] varchar(15), …… [Fri171229#string] varchar(15), [Sat171230#string] varchar(15), [Wk52#string] varchar(15), [Sun171231#string] varchar(15), [Wk53#string] varchar(15)
- Create temp summary table to store pivoted final data
--table tmp_summary, add spid value in the global temp table name to isolate concurrent users For using dynamic sql query we have to use global temp table, or we are unable to access the temp table. Seq (sequence) column is for displaying data in order for each job name
declare @tmptable varchar(1000) select @tmptable = '##tmp_summary_' + cast(@@SPID as varchar) exec('if object_id(''tempdb..' + @tmptable + ''') is not null drop table ' + @tmptable) exec('create table ' + @tmptable + ' ([!Year] int,[JobName#string] varchar(max),[!Seq] int,[Campaign#string] varchar(max),'+ @colDesc + ')')
Result:
create table tmp_summary_51 ([Year] int,JobId int,Seq int,JobName varchar(max),[Year2017 Total] varchar(20),[Month1 Total] varchar(20),[2017-01-01] varchar(20),[2017-01-02] varchar(20),[2017-01-03] varchar(20),[2017-01-04] varchar(20),[2017-01-05] varchar(20),[2017-01-06] varchar(20),[2017-01-07] varchar(20),[Week1 Total] varchar(20),[2017-01-08] varchar(20),[2017-01-09] ...... varchar(20),[2017-06-23] varchar(20),[2017-06-24] varchar(20),[Week25 Total] varchar(20),[2017-06-25] varchar(20),[2017-06-26] varchar(20),[2017-06-27] varchar(20),[2017-06-28] varchar(20),[2017-06-29] varchar(20),[2017-06-30] varchar(20),[Week26 Total] varchar(20))
- Turning table in 90 degree with pivot
declare @job varchar(255) declare csr cursor FAST_FORWARD for select Name from @tbl_job open csr fetch next from csr into @job while @@fetch_status = 0 begin -- get column list if @columns is null select @columns = replace (@col,' varchar (15)','') -- rotate -- inventory set @sql ='insert into ' + @tmptable + ' select Year, JobName, 1 seq, Campaign,' + @columns +' from ( select JobName,JobName+'' Inventory'' Campaign,[year],[agg],Inventory from #tmp where JobName = ''' + @job + ''' ) x pivot ( sum(Inventory) for [agg] in ('+@columns+') ) as pvt -- where [year] is not null order by [year], JobName, seq' exec(@sql) -- worked set @sql ='insert into ' + @tmptable + ' select Year, JobName, 2 seq, Campaign,' + @columns +' from ( select JobName, JobName + '' Worked'' Campaign, [year], [agg], Worked from #tmp where JobName = ''' + @job + ''' ) x pivot( sum(worked) for [agg] in ('+@columns+') ) as pvt' exec(@sql) -- % of completion set @sql ='insert into ' + @tmptable + ' select Year, JobName, 3 seq, Campaign,' + @columns +' from ( select JobName,''% of Completion'' Campaign, [year], [agg], CompleteRate from #tmp where JobName = ''' + @job + ''' ) x pivot ( avg(CompleteRate) for [agg] in ('+@columns+') ) as pvt' exec(@sql) --RPC set @sql ='insert into ' + @tmptable + ' select Year, JobName, 4 seq, Campaign,' + @columns +' from ( select JobName,''RPC'' Campaign, [year], [agg], RPC from #tmp where JobName = ''' + @job + ''' ) x pivot ( sum(RPC) for [agg] in ('+@columns+') ) as pvt' exec(@sql) --ReachRate set @sql ='insert into ' + @tmptable + ' select Year,JobName,5 seq, Campaign,' + @columns +' from ( select JobName,''Reach Rate'' Campaign, [year], [agg], ReachRate from #tmp where JobName = ''' + @job + ''' ) x pivot ( avg(ReachRate) for [agg] in ('+@columns+') ) as pvt' exec(@sql) fetch next from csr into @job end close csr deallocate csr
pivoted data look like this:
dynamic pivot statement:
- Rendered on web page
- Report filter:
- 5 temp tables are used
#campaign (Date, Day, JobName, Volume_to_work, Volume_worked, Volume_RPC) -- stores original data #view_campaign (JobName, Year, Weekday, Month, Week, Date, Inventory, Worked, RPC) -- summarized data #tmp (JobName, Year, Month, Week, Agg, Inventory, Worked, RPC) -- target aggregation data ##tmp_summary_spid(Year, JobName, Seq, Campaign, YTD, Jan, Sun, Mon, Tue, … ) -- final output data
- Further action:
Stores pivoted data in a table with ETL process to improve the performance.
Note:
- the data used in this note is test date, not real.
Note: Original post: http://itwebsys.blogspot.com/2017/07/reporting-notes-01.html. Move it here due to google decide to shutdown the blogspot system.