How to highlight searching words in text

Here is quick example how to highlight searching words using full text search in TSQL. It also avoids highlighting stop words
Author: Andrew Butenko
Create date: 14-01-2015
Description: highlight searching words in text including forms and excluding stop words

DECLARE @contents VARCHAR(MAX) = '123 defence ARMV-AT. NAVY-as, MC. - STRAIGHY, NEAR SOLDER CONTACTS. AN TYPE buiion 3: ', @searchTerm varchar(255) = 'defence navy'
SELECT [dbo].[HighLightSearchWords](@contents,@searchTerm)
CREATE FUNCTION [dbo].[HighLightSearchWords]
@contents VARCHAR(8000),
@searchTerm varchar(255)
DECLARE @retval VARCHAR(8000) = ''
;WITH Forms AS (
SELECT DISTINCT prs.display_term
FROM sys.dm_fts_parser(N'FORMSOF(FREETEXT, "' + @searchTerm + '")', 1033, 0, 1) prs
LEFT OUTER JOIN [sys].[fulltext_stopwords] stw ON prs.display_term COLLATE Latin1_General_CI_AS = stw.[stopword] COLLATE Latin1_General_CI_AS
WHERE stw.[stopword] IS NULL
, Matches(display_term, start_pos) AS(
SELECT display_term, CHARINDEX(display_term, @contents, 0) as start_pos
FROM Forms
WHERE CHARINDEX(display_term, @contents, 0)>0
SELECT m.display_term, CHARINDEX(m.display_term, @contents, m.start_pos+LEN(m.display_term)) as start_pos
FROM Matches AS m
WHERE CHARINDEX(m.display_term, @contents, m.start_pos+LEN(m.display_term))>0)
, Results AS(
SELECT ROW_NUMBER() OVER(ORDER BY start_pos) AS Id, start_pos, MAX(LEN(display_term)) as term_length
FROM Matches
GROUP BY start_pos)
SELECT @retval += (CASE l.Id WHEN 1 THEN SUBSTRING(@contents, 1, l.start_pos-1) ELSE '' END) --left
+'' + SUBSTRING(@contents, l.start_pos, l.term_length)+'' --middle
+(CASE WHEN r.start_pos - (l.start_pos + l.term_length + 1) > 0 THEN SUBSTRING(@contents, l.start_pos + l.term_length +1, r.start_pos - (l.start_pos + l.term_length + 1) )
ELSE SUBSTRING(@contents, l.start_pos + l.term_length, LEN(@contents))
END) --right
FROM Results l
LEFT OUTER JOIN Results r ON l.Id = r.Id - 1
ORDER BY l.start_pos
RETURN @retval

Posted in SQL | Tagged , | Leave a comment

Tea(m)making golden rules

That is brilliant metaphor by adding “m” letter to get 5 golden rules of team building.

teamaking rules

Posted in Uncategorized | Tagged , | Leave a comment

How to avoid LightSwitch change tracking issues

I am using Lightswitch since 2010. Since the beginning it was fast growing tool with amazing development team. I learnt a lot. But even now, some expectations of mine do not correlate with LS team vision. The typical misunderstanding is where should I run my code: on client or on server? Just a reminder, LS has 2 types of clients: HTML5 and Desktop (web or client side). Desktop client  is based on  Silverlight sendbox which brings number of restrictions. HTML5 client is great but has some authentication limitations. I use both of them, it still worth to use Desktop client for rich UI, and HTML one for mobile clients.

The problem with Desktop client – it is restricted by Silverlight sandbox which does not allow me run as many code as I want. That is why I have to offset many code on server side of application. Before version 2013 I used to use extensions, intrinsic datasources or RIA services delegating some client event. But again, it looked nasty. You know, I do not like overcomplicated patterns and my alter ego does not accept that approach.

In version 2013 (or -1 version) LS team added new events for data sources. Now we can use before save/insert/update/delete and after save/insert/update/delete events. It sounds like I found right place where to put my code and send changes to another system in real time. So I tried to use directly. The first problem was: before save events (executing/inserting/updating/deleting) know about changes, but it does not garantee committed transactions. So I had to use after save events (executed/inserting/updating/deleting). But they do not keep committed changes.

Moreover, when I tried to manipulate with dataset Linq requests in before save events I had ValidationException or Context is tracked by another instance exceptions on the client. Finally, I found simple workaround: cache changes in before event and use it after event to send to another system. Here is example:

EntityChangeSet changes = null;
partial void SaveChanges_Executing()
changes = this.DataWorkspace.MyDataSource.Details.GetChanges();

partial void SaveChanges_Executed()
if (changes != null)
foreach (var deletedItem in changes.DeletedEntities)
MyTargetEntityType item = deletedItem as MyTargetEntityType;
if (item != null && !string.IsNullOrEmpty(item.Value))
SendChangesToAnotherSystem(item, "delete");

foreach (var modifiedItem in changes.ModifiedEntities)
MyTargetEntityType item = modifiedItem as MyTargetEntityType;
if (item != null && !string.IsNullOrEmpty(item.Value))
SendChangesToAnotherSystem(item, "modify");

foreach (var addedItem in changes.AddedEntities)
MyTargetEntityType item = addedItem as MyTargetEntityType;
if (item != null && !string.IsNullOrEmpty(item.Value))
SendChangesToAnotherSystem(item, "upsert"); //add|upsert
changes = null;

Keep it simple!

Posted in LightSwitch | Tagged , | Leave a comment

How to clean text based on ASCII range

Ok, how can I clean my text with simple English symbols, let’s say from CHAR(10) to CHAR(125). In other words, if I have text “‘1,2,3…,4,5,•,6,ÿ,7′”, how can I get “1,2,3,4,5,,6,,7” ?

Here is my approach: if text within 64K, it takes less than 1 second. But if you process bigger text size , you have to use N5 set defined below, but in that case it could be 15 seconds,

Keep things simple please!

Here is code:

CREATED BY : Andrew Butenko
CREATED ON : 20/07/2014
COMMENTS : remove chars out of specified range, for example (10-125)
EXAMPLE : select [dbo].[f_getASCIIText](‘1,2,3…,4,5,•,6,ÿ,7’, 10, 125)


,N1 as (SELECT 1 as n FROM N0 t1, N0 t2)
,N2 as (SELECT 1 as n FROM N1 t1, N1 t2)
,N3 as (SELECT 1 as n FROM N2 t1, N2 t2)
,N4 as (SELECT 1 as n FROM N3 t1, N3 t2)
/*,N5 as (SELECT 1 as n FROM N4 t1, N4 t2)*/
FROM nums


Posted in SQL | Tagged | Leave a comment

Shoot me if I create that table

I have seen “that” table many times in different systems. It sounds like alien from another planet irresponsible developer creates that table without understanding how many efforts would be spent to maintain that structures. I am talking about organizing date ranges, aka slow changed dimension (SCD). Typical end-user approach is creating structure with the following attributes:

  • [PrimaryKey]
  • [ForeignKey]
  • [DateFrom]
  • [DateTo]
  • [Value]

What is wrong with this obvious implementation? Well, nothing.., but one. That is not a table. That might be view which end-user can observe. But that is unacceptable format for table. Try to avoid using that approach and never store data in that way.

Let me explain in details. As I said, this is common practice, moreover is is recommended by Microsoft in [AdventureWorks].[HumanResources].[EmployeeDepartmentHistory] table. Also, I have seen similar implementation in OpenAir entity tag table. Additionally to mentioned above structure, they have [IsDefault] field (without dates).

Now, let’s pretend that you need to track employment status over time. That means you have default record with employee type (full-time, part-time, sub-contructor, etc.). In terms of discussed model “Employee” is [ForeignKey] and “Employee Type” is [Value].  For some employees they have default records without dates. For other ones they have a combination of default and records with dates. Moreover for some records [DateTo] is not specified. Now,

  • what if you need split that ranges?
  • what is a risk overlapping modified/inserted with existing ones?
  • how to find gaps?
  • what if you need to reformat these ranges to calendar (or anniversary) years?

If you understand combinatorial algorithms, know TSQL and read Itzik Ben-Gan books you will find solution like that:

SELECT a.[user_id]
,(CASE WHEN a.[start_date] < @FromDate THEN @FromDate ELSE a.[start_date] END) AS [start_date]
,(CASE WHEN a.[end_date] > @ToDate THEN @ToDate ELSE a.[end_date] END) AS [end_date]
,DATEDIFF(DAY,a.[start_date],a.[end_date]) AS [EntityTagDays]
,DATEDIFF(DAY, @FromDate, @ToDate)+1 AS [TotalDays]
,DATEDIFF(DAY,(CASE WHEN a.[start_date] < @FromDate THEN @FromDate ELSE a.[start_date] END)
,(CASE WHEN a.[end_date] > @ToDate THEN @ToDate ELSE a.[end_date] END))+1 AS [Days]
,CONVERT(MONEY,DATEDIFF(DAY,(CASE WHEN a.[start_date] < @FromDate THEN @FromDate ELSE a.[start_date] END)
,(CASE WHEN a.[end_date] > @ToDate THEN @ToDate ELSE a.[end_date] END))+1) / (DATEDIFF(DAY, @FromDate, @ToDate)+1) AS [Weight]
–,COUNT(DISTINCT cld.[DateKey]) AS [Days]
FROM L40 a
WHERE ( (a.[start_date] BETWEEN @FromDate AND @ToDate)
OR (a.[end_date] BETWEEN @FromDate AND @ToDate)
OR (@FromDate BETWEEN a.[start_date] AND a.[end_date])
OR (@ToDate BETWEEN a.[start_date] AND a.[end_date])

What a mess 🙂

road signs

Why it is so complicated? It must be simple like that:


Here is right decision:

  • [PrimaryKey]
  • [ForeignKey]
  • [AsOfDate]
  • [Value]

All you need is effective date for new value. That is it. You always can build ranges with start and end date based on that approach. You can inject (split ranges) new record between existing ones, or remove (merge ranges). You can prevent overlapping or duplicated dates. You can build the view just like that:

WITH Sequence AS (–add sequence id
SELECT [ForeignKey], [AsOfDate], [Value], NULLIF(ROW_NUMBER() OVER(PARTITION BY [ForeignKey] ORDER BY [AsOfDate]),0) [SeqId]
FROM [Table]
–build ranges
SELECT [left].[ForeignKey], [left].[AsOfDate] AS [DateFrom], ISNULL(DATEADD(DAY,-1, [right].[AsOfDate]), GETDATE()) AS [DateTo], [left].[Value]
FROM Sequence AS [left]
LEFT OUTER JOIN Sequence AS [right] ON [left].[ForeignKey] = [right].[ForeignKey]
AND [left].[SeqId] = [right].[SeqId] – 1

Please, use simple algorithms!

Posted in SQL, Uncategorized | Tagged , , , | Leave a comment

Data Driven Subscription Standard Edition

Data Driven Subscription (DDS) is SSRS feature available in SQL Server Enterprise Edition only. It allows rendering many copies of the same report based on combination of parameters and deliver them by query defined destinations (emails or file location). DDS checks if report has data and pushes that like notification. That means other SQL Server editions lack of that feature. For that reason (+ customization), I have created small solution which allows filling the gap.

As always, implementation is build on assumptions (in other words, based on specific interfaces).  Here is a list of assumptions:

  • Not every report needs DDS.
  • If report requires DDS, it must:
    • have parameter UserId
    • have subscription with dedicated shared schedule. That schedule should be based on day with specific time, for example 8:00AM (user local time, not server time).
  • DDS report can have many subscriptions. Shared schedule can be used in many subscriptions
  • Dedicated shared schedule should be:
    • named with prefix, for example, “DataDrivenSubscription.”.
    • paused, which means SQL Server would not call that schedule.
  • UserId parameter should be based on generic lookup list of users. For example, database view [dbo].[vw_ReportSubscribers]. It should have the following fields:
    • Id. It is used as DDS report parameter.
    • Email. It is used for report delivery
    • DisplayName. Is used as part of email invitation
    • TimeZone. It allows sending report based on end user local time
  • As far as shared schedule is paused, another SQL Server job has to invoke report rendering. That managing job should use frequent schedule (every 5-10 minutes) simulating long pooling. That job contains one step with SSIS package which loops user parameter collection and makes a decision which UserId report render and send.
  • DDS should evaluate report data. For that reason the report subscription [Comments] field should have SQL query which check if report has data. If report has no data – it should not be rendered, no report data -no delivery.

The following diagram describes the process:

DDS diagram

Every X minutes SQL Server agent job runs SSIS package which iterates through users collection. Stored procedure, let’s call it [dbo].[GetRenderSettingsData], checks if report has data for that user and generates xml which is used by Unit Testing SSRS Tool to render the report.

When report’s file is rendered by the tool, it is sent by SSIS package email task to end user. The delivery is based on user local time (this feature is unique for this solution and does not exist in SSRS Enterprise Edition).

Posted in SQL, SSIS, SSRS | Tagged | 1 Comment

How to get enterprise features within budget edition. SSRS

Typical question which I was asked many times. Ok, here is my report, can you send it to XXX when YYY is happened?

Usually I say, sure we can use data driven subscription and … wait a minute, do you have SSRS express/standard edition? Data driven subscription is available in enterprise edition (EE) only. You can’t afford EE? Me too :(…

Calm down, deep breath and think. Remember my Unit Testing SSRS tool. It could be used not for testing only, but allows organising complex / customized subscription, managing collection of linked reports, etc. 

In addition to the tool, I would like to show another simple trick. Let’s say, you have report named “My Tasks Report” based on data in table [Tasks]. You deploy the report and create static subscription, configure parameters and  schedule it to be delivered every morning. Next morning you receive empty report. Well, no new tasks assigned to you yet. How to avoid sending blank report? Data driven subscriptions! Again?

Let’s have a look at static subscription internals. When you create new subscription SSRS generates SQL Server Agent job named like “250632ed-bf84-48f4-b904-42177685adaf”, which does not look fancy but is unique. That job has only 1 step like the following:

exec [ReportServer].[dbo].[AddEvent] @EventType=’TimedSubscription’, @EventData=’d9ead81c-990b-4f6c-9e33-3e021c9b9234′

When you stop or pause subscription, the job is deleted. When you resume subscription, new GUID based name job is generated.

That means you should not add extra steps to the job to prevent sending blank reports. It will be deleted next time when use pause/resume subscription. Sounds not good? But, that is ok. Stored procedure [dbo].[AddEvent] is pretty straightforward. It adds new event, the event is hooked up by SSRS windows service timer (yes, they have windows service thread which constantly checks new events in that table and process it). After processing the event is deleted from the [Event] table. Here is my solution. Just add some logic to stored procedure to prevent adding blank events. That is it. Here is example:

ALTER PROCEDURE [dbo].[AddEvent]
@EventType nvarchar (260),
@EventData nvarchar (260)
—-start of modification———–
FROM [dbo].[Subscriptions] subs
INNER JOIN [dbo].[Catalog] catl ON catl.[ItemID] = subs.Report_OID
AND catl.Name = ‘My Tasks Report
WHERE subs.[SubscriptionID] = @EventData
AND NOT EXISTS (SELECT TOP 1 1 FROM [MyDBName].[dbo].[Tasks] WHERE [AssignedTo] = ‘Me’ AND [AssignedDate]>=CONVERT(DATE,GETDATE())
   PRINT ‘This subscription report has no data. The event would not be fired.’

—-end of modification———–
insert into [Event]
([EventID], [EventType], [EventData], [TimeEntered], [ProcessStart], [BatchID])
(NewID(), @EventType, @EventData, GETUTCDATE(), NULL, NULL)

I believe, right combination of tools and tricks could compensate report subscription requirements even with express/standard edition of SSRS, making your customer happy.

Posted in Uncategorized | Tagged , | Leave a comment