Tea(m)making golden rules

That is a brilliant metaphor: add the letter “m” to tea-making instructions and you get five golden rules of team building.

[Image: tea(m)making rules]


How to avoid LightSwitch change tracking issues

I have been using LightSwitch since 2010. From the beginning it has been a fast-growing tool with an amazing development team, and I have learnt a lot. But even now, some of my expectations do not match the LS team’s vision. The typical misunderstanding is where I should run my code: on the client or on the server? Just a reminder: LS has two types of clients, HTML5 and Desktop (running in the browser or as a client-side application). The Desktop client is based on the Silverlight sandbox, which brings a number of restrictions. The HTML5 client is great but has some authentication limitations. I use both of them: it is still worth using the Desktop client for rich UI, and the HTML one for mobile clients.

The problem with the Desktop client is that it is restricted by the Silverlight sandbox, which does not allow me to run as much code as I want. That is why I have to offload a lot of code to the server side of the application. Before version 2013 I used to use extensions, intrinsic data sources or RIA services to delegate some client events. But again, it looked nasty. You know, I do not like overcomplicated patterns, and my alter ego does not accept that approach.

In version 2013 (or the version before it) the LS team added new events for data sources. Now we can use before save/insert/update/delete and after save/insert/update/delete events. It sounded like I had found the right place to put my code and send changes to another system in real time, so I tried to use them directly. The first problem: the before-save events (executing/inserting/updating/deleting) know about the changes, but they do not guarantee the transaction will be committed. So I had to use the after-save events (executed/inserted/updated/deleted). But by then the committed changes are no longer available.

Moreover, when I tried to run LINQ queries against the data workspace inside the before-save events, I got ValidationException or “Context is tracked by another instance” exceptions on the client. Finally, I found a simple workaround: cache the changes in the before event and use them in the after event to send to another system. Here is an example:


EntityChangeSet changes = null;

// Cache the change set before the save transaction runs.
partial void SaveChanges_Executing()
{
    changes = this.DataWorkspace.MyDataSource.Details.GetChanges();
}

// After the save has committed, replay the cached changes to the other system.
partial void SaveChanges_Executed()
{
    if (changes != null)
    {
        foreach (var deletedItem in changes.DeletedEntities)
        {
            MyTargetEntityType item = deletedItem as MyTargetEntityType;
            if (item != null && !string.IsNullOrEmpty(item.Value))
            {
                SendChangesToAnotherSystem(item, "delete");
            }
        }

        foreach (var modifiedItem in changes.ModifiedEntities)
        {
            MyTargetEntityType item = modifiedItem as MyTargetEntityType;
            if (item != null && !string.IsNullOrEmpty(item.Value))
            {
                SendChangesToAnotherSystem(item, "modify");
            }
        }

        foreach (var addedItem in changes.AddedEntities)
        {
            MyTargetEntityType item = addedItem as MyTargetEntityType;
            if (item != null && !string.IsNullOrEmpty(item.Value))
            {
                SendChangesToAnotherSystem(item, "upsert"); // add|upsert
            }
        }
    }
    changes = null;
}

Keep it simple!


How to clean text based on ASCII range

OK, how can I clean my text down to simple English symbols, let’s say from CHAR(10) to CHAR(125)? In other words, if I have the text “1,2,3…,4,5,•,6,ÿ,7”, how can I get “1,2,3,4,5,,6,,7”?

Here is my approach: if the text is within 64K characters, it takes less than a second. If you process bigger text, you have to uncomment the N5 set defined below (and point the nums CTE at N5), but in that case it could take up to 15 seconds.

Keep things simple please!

Here is the code:

CREATE FUNCTION [dbo].[f_getASCIIText]
/**************************************************************
CREATED BY : Andrew Butenko
CREATED ON : 20/07/2014
COMMENTS   : remove chars out of specified range, for example (10-125)
EXAMPLE    : select [dbo].[f_getASCIIText]('1,2,3…,4,5,•,6,ÿ,7', 10, 125)
**************************************************************/
(@STRING VARCHAR(MAX), @START INT, @END INT)
RETURNS VARCHAR(MAX) WITH SCHEMABINDING AS
BEGIN
    DECLARE @RESULT VARCHAR(MAX) = ''
    SET @STRING = LTRIM(RTRIM(@STRING))

    ;WITH N0 as (SELECT 1 as n UNION ALL SELECT 1)       -- 2 rows
    ,N1 as (SELECT 1 as n FROM N0 t1, N0 t2)             -- 4 rows
    ,N2 as (SELECT 1 as n FROM N1 t1, N1 t2)             -- 16 rows
    ,N3 as (SELECT 1 as n FROM N2 t1, N2 t2)             -- 256 rows
    ,N4 as (SELECT 1 as n FROM N3 t1, N3 t2)             -- 65,536 rows (64K)
    /*,N5 as (SELECT 1 as n FROM N4 t1, N4 t2)*/         -- uncomment for text beyond 64K
    ,nums as (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) as num FROM N4)
    SELECT @RESULT = @RESULT + SUBSTRING(@STRING, num, 1)
    FROM nums
    WHERE num <= LEN(@STRING)
    AND ASCII(SUBSTRING(@STRING, num, 1)) BETWEEN @START AND @END

    RETURN @RESULT
END
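A quick sanity check, using the example from the header comment (this assumes a Windows-1252 style collation, so the fancy characters map above 125):

SELECT [dbo].[f_getASCIIText]('1,2,3…,4,5,•,6,ÿ,7', 10, 125);
-- expected: 1,2,3,4,5,,6,,7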


Shoot me if I create that table

I have seen “that” table many times in different systems. It feels like an irresponsible developer from another planet creates it without understanding how much effort will be spent maintaining that structure. I am talking about organizing date ranges, aka a slowly changing dimension (SCD). The typical end-user approach is a structure with the following attributes:

  • [PrimaryKey]
  • [ForeignKey]
  • [DateFrom]
  • [DateTo]
  • [Value]

What is wrong with this obvious implementation? Well, nothing… except one thing. That is not a table. It might be a view which the end user observes, but it is an unacceptable format for a table. Try to avoid this approach and never store data that way.

Let me explain in detail. As I said, this is common practice; moreover, it is effectively endorsed by Microsoft in the [AdventureWorks].[HumanResources].[EmployeeDepartmentHistory] table. I have also seen a similar implementation in the OpenAir entity tag table. In addition to the structure above, they have an [IsDefault] field (without dates).

Now, let’s pretend that you need to track employment status over time. That means you have a default record with the employee type (full-time, part-time, sub-contractor, etc.). In terms of the discussed model, “Employee” is the [ForeignKey] and “Employee Type” is the [Value]. Some employees have only a default record without dates; others have a combination of the default record and dated records. Moreover, for some records [DateTo] is not specified. Now,

  • what if you need to split those ranges?
  • what is the risk of overlapping a modified/inserted range with existing ones?
  • how do you find gaps?
  • what if you need to reformat these ranges into calendar (or anniversary) years?

If you understand combinatorial algorithms, know T-SQL and have read Itzik Ben-Gan’s books, you will end up with a solution like this:

SELECT a.[user_id]
,a.[EntityTagValue]
,(CASE WHEN a.[start_date] < @FromDate THEN @FromDate ELSE a.[start_date] END) AS [start_date]
,(CASE WHEN a.[end_date] > @ToDate THEN @ToDate ELSE a.[end_date] END) AS [end_date]
,DATEDIFF(DAY,a.[start_date],a.[end_date]) AS [EntityTagDays]
,DATEDIFF(DAY, @FromDate, @ToDate)+1 AS [TotalDays]
,DATEDIFF(DAY,(CASE WHEN a.[start_date] < @FromDate THEN @FromDate ELSE a.[start_date] END)
,(CASE WHEN a.[end_date] > @ToDate THEN @ToDate ELSE a.[end_date] END))+1 AS [Days]
,CONVERT(MONEY,DATEDIFF(DAY,(CASE WHEN a.[start_date] < @FromDate THEN @FromDate ELSE a.[start_date] END)
,(CASE WHEN a.[end_date] > @ToDate THEN @ToDate ELSE a.[end_date] END))+1) / (DATEDIFF(DAY, @FromDate, @ToDate)+1) AS [Weight]
--,COUNT(DISTINCT cld.[DateKey]) AS [Days]
FROM L40 a
WHERE ( (a.[start_date] BETWEEN @FromDate AND @ToDate)
OR (a.[end_date] BETWEEN @FromDate AND @ToDate)
OR (@FromDate BETWEEN a.[start_date] AND a.[end_date])
OR (@ToDate BETWEEN a.[start_date] AND a.[end_date])
)

What a mess 🙂

[Image: road signs]

Why is it so complicated? It should be as simple as this:

[Image: monkey]

Here is the right design:

  • [PrimaryKey]
  • [ForeignKey]
  • [AsOfDate]
  • [Value]

All you need is an effective date for each new value. That is it. You can always build ranges with start and end dates from this structure. You can inject a new record between existing ones (splitting ranges) or remove one (merging ranges). You can prevent overlapping or duplicated dates. And you can build the view just like this:

WITH Sequence AS ( -- add sequence id
    SELECT [ForeignKey], [AsOfDate], [Value],
           ROW_NUMBER() OVER (PARTITION BY [ForeignKey] ORDER BY [AsOfDate]) AS [SeqId]
    FROM [Table]
)
-- build ranges: pair each row with the next one for the same key
SELECT [left].[ForeignKey],
       [left].[AsOfDate] AS [DateFrom],
       ISNULL(DATEADD(DAY, -1, [right].[AsOfDate]), GETDATE()) AS [DateTo],
       [left].[Value]
FROM Sequence AS [left]
LEFT OUTER JOIN Sequence AS [right]
    ON [left].[ForeignKey] = [right].[ForeignKey]
    AND [left].[SeqId] = [right].[SeqId] - 1
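Because each row carries a single effective date, a plain unique constraint is enough to rule out duplicated (and therefore overlapping) dates. Here is a minimal sketch of the underlying table; all names are illustrative:

CREATE TABLE [dbo].[ValueHistory]
(
    [PrimaryKey] INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [ForeignKey] INT NOT NULL,          -- e.g. EmployeeId
    [AsOfDate]   DATE NOT NULL,         -- effective date of the new value
    [Value]      VARCHAR(50) NOT NULL,  -- e.g. employee type
    CONSTRAINT [UQ_ValueHistory_Key_AsOfDate] UNIQUE ([ForeignKey], [AsOfDate])
);

-- splitting a range is just inserting a new effective date between two existing ones
INSERT INTO [dbo].[ValueHistory] ([ForeignKey], [AsOfDate], [Value])
VALUES (1, '20140701', 'part-time');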

Please, use simple algorithms!


Data Driven Subscription Standard Edition

Data Driven Subscription (DDS) is an SSRS feature available in SQL Server Enterprise Edition only. It renders many copies of the same report based on combinations of parameters and delivers them to query-defined destinations (email addresses or file locations). DDS checks whether the report has data and pushes it like a notification. Other SQL Server editions lack this feature. For that reason (plus customization), I have created a small solution which fills the gap.

As always, the implementation is built on assumptions (in other words, on specific interfaces). Here is the list of assumptions:

  • Not every report needs DDS.
  • If a report requires DDS, it must:
    • have a UserId parameter;
    • have a subscription with a dedicated shared schedule. That schedule should be daily with a specific time, for example 8:00 AM (user local time, not server time).
  • A DDS report can have many subscriptions, and a shared schedule can be used in many subscriptions.
  • The dedicated shared schedule should be:
    • named with a prefix, for example “DataDrivenSubscription.”;
    • paused, which means SQL Server will not invoke that schedule itself.
  • The UserId parameter should be based on a generic lookup list of users, for example a database view [dbo].[vw_ReportSubscribers] with the following fields:
    • Id — used as the DDS report parameter.
    • Email — used for report delivery.
    • DisplayName — used as part of the email invitation.
    • TimeZone — allows sending the report based on the end user’s local time.
  • Since the shared schedule is paused, another SQL Server job has to invoke report rendering. That managing job should use a frequent schedule (every 5-10 minutes), simulating long polling. The job contains one step with an SSIS package which loops through the user parameter collection and decides for which UserId to render and send the report.
  • DDS should evaluate the report data. For that reason, the report subscription [Comments] field should hold a SQL query which checks whether the report has data. If the report has no data, it should not be rendered: no report data, no delivery (see the sketch after this list).
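For example, here is a minimal sketch of such a “has data” query for a tasks report; the database, table and column names are illustrative, and the managing package is assumed to substitute the user id:

-- returns a row only when the report would have data for the given user
SELECT TOP 1 1
FROM [MyDBName].[dbo].[Tasks]
WHERE [AssignedTo] = @UserId
  AND [AssignedDate] >= CONVERT(DATE, GETDATE());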

The following diagram describes the process:

[Diagram: DDS process]

Every X minutes a SQL Server Agent job runs an SSIS package which iterates through the user collection. A stored procedure, let’s call it [dbo].[GetRenderSettingsData], checks whether the report has data for that user and generates the XML which is used by the Unit Testing SSRS Tool to render the report.

When the report file is rendered by the tool, it is sent to the end user by an SSIS package email task. Delivery is based on the user’s local time (this feature is unique to this solution and does not exist even in SSRS Enterprise Edition).
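As a hedged sketch of the local-time check (assuming SQL Server 2016+ for AT TIME ZONE, and that the [TimeZone] field stores Windows time zone names), the decision might look like this:

DECLARE @TimeZone SYSNAME = 'Pacific Standard Time'; -- from [dbo].[vw_ReportSubscribers].[TimeZone]
DECLARE @UserLocalTime DATETIME =
    CONVERT(DATETIME, SYSUTCDATETIME() AT TIME ZONE 'UTC' AT TIME ZONE @TimeZone);
-- fire the subscription once the user's local time reaches the schedule time, e.g. 8:00 AM
SELECT CASE WHEN CONVERT(TIME, @UserLocalTime) >= '08:00' THEN 1 ELSE 0 END AS [ShouldSend];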


How to get enterprise features within a budget edition: SSRS

A typical question I have been asked many times. OK, here is my report, can you send it to XXX when YYY happens?

Usually I say: sure, we can use a data driven subscription and… wait a minute, do you have the SSRS Express/Standard edition? Data driven subscription is available in Enterprise Edition (EE) only. You can’t afford EE? Me neither :(…

Calm down, take a deep breath and think. Remember my Unit Testing SSRS tool? It can be used not only for testing: it also allows organizing complex/customized subscriptions, managing collections of linked reports, etc.

In addition to the tool, I would like to show another simple trick. Let’s say you have a report named “My Tasks Report” based on data in the [Tasks] table. You deploy the report, create a static subscription, configure the parameters and schedule it to be delivered every morning. Next morning you receive an empty report. Well, no new tasks have been assigned to you yet. How do you avoid sending a blank report? Data driven subscriptions! Again?

Let’s have a look at static subscription internals. When you create a new subscription, SSRS generates a SQL Server Agent job named something like “250632ed-bf84-48f4-b904-42177685adaf”, which does not look fancy but is unique. That job has only one step, like the following:

exec [ReportServer].[dbo].[AddEvent] @EventType='TimedSubscription', @EventData='d9ead81c-990b-4f6c-9e33-3e021c9b9234'

When you stop or pause the subscription, the job is deleted. When you resume it, a new job with a new GUID-based name is generated.

That means you should not add extra steps to the job to prevent sending blank reports: they will be deleted the next time the subscription is paused or resumed. Sounds bad? Actually, that is OK. The stored procedure [dbo].[AddEvent] is pretty straightforward. It adds a new event; the event is picked up by the SSRS Windows service timer (yes, there is a Windows service thread which constantly checks for new events in that table and processes them). After processing, the event is deleted from the [Event] table. Here is my solution: just add some logic to the stored procedure to prevent adding blank events. That is it. Here is an example:

ALTER PROCEDURE [dbo].[AddEvent]
    @EventType nvarchar(260),
    @EventData nvarchar(260)
AS
----start of modification----
-- skip the event when the subscribed report would be empty
IF EXISTS (SELECT TOP 1 1
           FROM [dbo].[Subscriptions] subs
           INNER JOIN [dbo].[Catalog] catl ON catl.[ItemID] = subs.[Report_OID]
               AND catl.[Name] = 'My Tasks Report'
           WHERE subs.[SubscriptionID] = @EventData)
AND NOT EXISTS (SELECT TOP 1 1
                FROM [MyDBName].[dbo].[Tasks]
                WHERE [AssignedTo] = 'Me'
                  AND [AssignedDate] >= CONVERT(DATE, GETDATE()))
BEGIN
    PRINT 'This subscription report has no data. The event will not be fired.'
    RETURN;
END
----end of modification----
insert into [Event]
([EventID], [EventType], [EventData], [TimeEntered], [ProcessStart], [BatchID])
values
(NewID(), @EventType, @EventData, GETUTCDATE(), NULL, NULL)
GO

I believe the right combination of tools and tricks can meet report subscription requirements even with the Express/Standard edition of SSRS, making your customer happy.


Import a dirty CSV file

Let’s pretend that you need to import a neat.csv file (provided by an external vendor/partner/system) into your database table. You must be lucky to import data like this:
~~~~~~~~~~~~~~~
Field1,Field2,Field3,Field4
1,”A”,10.0,2013/10/01
2,”B”,20.0,2013/10/02
3,””,0,2013/10/03
~~~~~~~~~~~~~~~

As you can see, there are 3 records with 4 fields to import. But what about an ugly.csv file with multi-line text, a number of unexpected quotes and delimiters, blank values, and an unpredictable number of columns per row?
~~~~~~~~~~~~~~~
Field1,Field2,Field3,Field4
1,”A
ghgh ghhgh,,,,,,,,,,,, ,,,,,,,
A’ha “,10.0, 53-44-1008
2 , “B “”Br
rrrrr}dd {} “”//
aghhhrr”” Bue”,20.0,2013/10/02 14:00:56

3, ,       , , , , ,,

~~~~~~~~~~~~~~~

Do you like it? Are you still ready to import this rubbish? If not, please stop reading now :).

Welcome back! You are a brave person if you are reading this post. Here is the plan:

  1. Clean data
  2. Load data
  3. Swap data

Let’s dig into each step.

Clean data

Put all of a row’s data on one line. Replace extra delimiters with temporary ones. You can clean the text using the Replace() function. But how can you distinguish a new row from a multi-line text field? You need “smart regular expressions” to find row patterns, and a “smart parser” to count delimiters, as sketched below.
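The delimiter counting at least is cheap. In the neat sample above a complete row has exactly three commas (four fields); a classic T-SQL way to count them:

DECLARE @line VARCHAR(MAX) = '1,"A",10.0,2013/10/01';
-- 3 commas = a complete row; fewer means the line continues a multi-line value.
-- Commas inside quoted text inflate the count, which is exactly why the "smart parser" is needed.
SELECT LEN(@line) - LEN(REPLACE(@line, ',', '')) AS [DelimiterCount];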

Load data

Validate and import the data. How do you load the rows? Using an iterator to load them one by one? Or a “smart bulk load”? A sketch follows.
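Here is a minimal sketch of the bulk-load step, assuming the file has already been cleaned to one record per line and a staging table [dbo].[Import_new] exists with matching columns (both names are illustrative):

BULK INSERT [dbo].[Import_new]
FROM 'C:\import\ugly.cleaned.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, TABLOCK);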

Swap data

How do you reload data for the whole table without locking out live users? You need a “smart swap algorithm”, for example the current->old, new->current pattern (sketched below). Finally, replace the temporary delimiters in text-based fields with the original ones.
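A minimal sketch of the current->old, new->current swap, assuming the validated data sits in [dbo].[Import_new] from the previous step (names are illustrative):

BEGIN TRANSACTION;
EXEC sp_rename 'dbo.Import', 'Import_old';  -- current -> old
EXEC sp_rename 'dbo.Import_new', 'Import';  -- new -> current
COMMIT TRANSACTION;

Users see the fresh data almost instantly; the swap holds only a brief schema lock instead of blocking the table for the whole load.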

Too many “smart” buzzwords? Well, that is the price you pay for dirty data. I am sure you are smart enough to choose an appropriate framework/language/engine to solve this puzzle, which is why I do not want to force my solution on you. Just imagine that you need some freedom and custom rules which other vendors do not provide.

Let me show my algorithm’s code metrics:

[Image: parser code metrics]

Using this approach I was able to load data in 15 minutes which was originally loaded by another solution in 5-8 hours.

I still believe in algorithms.

Happy coding!
